MogileFS is an open source distributed filesystem that is flexible and highly available, running on a network of commodity hardware.
MogileFS is an anagram of "OMG Files". It was created for LiveJournal to handle the storage, replication and retrieval of a large volume of file uploads. MogileFS is a Danga Interactive project. Six Apart acquired Danga Interactive in 2005.
Who uses MogileFS: LiveJournal, Digg, Skyrock, Wikispaces, Friendster
Key Enablers
- A scalable, fault-tolerant, high-performance distributed file system
- No single point of failure
- Automatic file replication (3 replicas recommended)
- Better than RAID: files are replicated across hosts, so data survives the loss of a whole machine and not only of a disk
- Flat namespace
- Shared-nothing
- No RAID required
- Local filesystem agnostic
- Tracker (mogilefsd): handles client requests and runs the replication, deletion, query, reaper and monitor jobs
- Files are replicated and spread across the storage nodes (mogstored), each of which is an HTTP and WebDAV server
- A MySQL database stores the MogileFS metadata (the namespace, and which files are where)
- Client libraries: Ruby, Perl, Java, Python, PHP…
- To increase the availability of MogileFS, two database servers can be clustered (active/passive) with Solaris Cluster
- 2 tracker nodes for availability, plus one for load balancing
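Running several trackers only helps if the client is ready to fall back from one to the next. Here is a minimal sketch of that idea, assuming the trackers listen on port 7001 (the usual mogilefsd default). The host names are hypothetical, and the real client libraries have their own failover logic, which may differ from this.

```python
import socket

# Hypothetical tracker addresses; 7001 is assumed to be the mogilefsd port.
TRACKERS = [("tracker1.example.com", 7001),
            ("tracker2.example.com", 7001),
            ("tracker3.example.com", 7001)]

def connect_to_any_tracker(trackers, timeout=3.0, connect=socket.create_connection):
    """Return (address, connection) for the first tracker that accepts a connection.

    `connect` can be replaced in tests; by default it opens a TCP socket.
    """
    last_error = None
    for host, port in trackers:
        try:
            return (host, port), connect((host, port), timeout)
        except OSError as exc:
            # This tracker is down or unreachable: remember why, try the next one.
            last_error = exc
    raise ConnectionError(f"no tracker reachable: {last_error}")
```

With three trackers in the list, the client keeps working as long as any one of them answers.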
Security
- To secure the MogileFS cluster, you should encrypt the data in order to safeguard all transactions over the web.
Proof Of Concept
- Create an architecture with three servers (tracker, database, storage node) and test the performance and the feasibility of MogileFS.
- To test MogileFS quickly, you can create 3 Solaris Containers (tracker, database, storage node) on the same physical server.
- Interface your application with MogileFS and implement the "Save as Cloud..." and "Open from Cloud..." functions.
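As a starting point for the application interface, here is a minimal sketch of the line-based text protocol spoken between a client and the tracker, as I understand it: a command name followed by URL-encoded arguments, answered by a line starting with OK or ERR. The domain `myapp`, the key `report.pdf` and the sample reply are hypothetical, and the sketch only builds and parses messages without touching the network. In a real application one of the client libraries would do this for you.

```python
from urllib.parse import urlencode, parse_qs

def build_request(command, **args):
    """Format one tracker request line: '<command> <url-encoded args>' + CRLF."""
    return f"{command} {urlencode(args)}\r\n".encode("ascii")

def parse_response(line):
    """Split a tracker reply into (ok, payload).

    On success the payload is a dict of decoded arguments,
    on error it is the raw text that follows 'ERR'.
    """
    status, _, rest = line.decode("ascii").strip().partition(" ")
    if status != "OK":
        return False, rest
    return True, {key: values[0] for key, values in parse_qs(rest).items()}

def extract_paths(payload):
    """Collect path1..pathN from a get_paths reply (N is given by 'paths')."""
    count = int(payload.get("paths", 0))
    return [payload[f"path{i}"] for i in range(1, count + 1)]

# "Save as Cloud...": ask the tracker where the new file may be written.
save_request = build_request("create_open", domain="myapp", key="report.pdf")

# "Open from Cloud...": ask the tracker where the replicas of the file live.
open_request = build_request("get_paths", domain="myapp", key="report.pdf")

# Hypothetical reply to get_paths, pointing at one storage node.
sample_reply = (b"OK paths=1&path1=http%3A%2F%2F10.0.0.5%3A7500"
                b"%2Fdev1%2F0%2F000%2F000%2F0000000001.fid\r\n")
ok, payload = parse_response(sample_reply)
replica_urls = extract_paths(payload) if ok else []
```

For "Save as Cloud..." the client would then send the file with an HTTP PUT to the path returned by the tracker and confirm with a `create_close` command. For "Open from Cloud..." it would fetch one of the returned URLs with an HTTP GET.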
Service and Support
- MogileFS support is available from Six Apart: http://www.sixapart.com
Architecture Overview
Hardware Sizing
- Usable Data Volume : Customer data volume (customer need)
- Number of replication blocks : 3 (recommended)
- 2 CPU quad-core
- 16 GB RAM minimum
- Work Data Volume (metadata, namespace...)
- Raw Data Volume : (Usable Data Volume * Number of replication blocks) + Work Data Volume
- Number of cluster nodes : Max (Number of replication blocks, (Number of Trackers + Number of Storage Nodes))
- No RAID factor, no HBA port
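The two sizing formulas above can be written as a small calculation. This is only a sketch: the 50 TB of customer data and the 2 TB of work volume are made-up figures, and the helper that derives the number of storage nodes from the capacity per node is my own assumption, not part of the sizing rules.

```python
import math

def raw_data_volume(usable_tb, replicas=3, work_tb=0.0):
    """Raw Data Volume = (Usable Data Volume * replication blocks) + Work Data Volume."""
    return usable_tb * replicas + work_tb

def cluster_nodes(replicas, trackers, storage_nodes):
    """Number of cluster nodes = max(replication blocks, trackers + storage nodes)."""
    return max(replicas, trackers + storage_nodes)

def storage_nodes_needed(raw_tb, tb_per_node):
    """Assumption: disk capacity per node is the only limiting factor."""
    return math.ceil(raw_tb / tb_per_node)

if __name__ == "__main__":
    # Hypothetical customer need: 50 TB usable, 3 replicas, 2 TB of work data.
    raw = raw_data_volume(usable_tb=50, replicas=3, work_tb=2)
    # 48 TB per node corresponds to a storage node with 48 disks of 1 TB.
    nodes = storage_nodes_needed(raw, tb_per_node=48)
    print(raw, nodes, cluster_nodes(replicas=3, trackers=3, storage_nodes=nodes))
```

With these figures the raw volume is 152 TB, which fits on 4 storage nodes of 48 TB, and with 3 trackers the cluster counts 7 nodes.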
Hardware Configuration
- 2 database servers : 2 CPU Quad-Core / 32 GB RAM / 2 HD 146 GB SAS / 4 GbE ports,
with a Sun StorageTek array (12 disks SAS 146 GB, RAID 1) and Solaris Cluster for high availability
- 3 tracker servers : 2 CPU Quad-Core / 32 GB RAM / 2 HD 146 GB SAS / 4 GbE ports
- 4 storage node servers : 2 CPU Quad-Core / 32 GB RAM / 48 HD 1 TB SATA II / 4 GbE ports
- No Fibre Channel interface is needed to connect the trackers and storage nodes, because the network is Ethernet.
MogileFS Presentation
Download
http://www.sixapart.com
http://www.danga.com/mogilefs



Excellent Article!
Can you explain why one might need 32 GB RAM minimum?
Regards
Rajan
Posted by Rajan on November 03, 2009 at 09:50 AM CET #