MooseFS howto


The original is at this Chinese blog:

mfs usage


I'm gonna try to fix the translations here:


mtime: 2014-03-19 20:50:07 by liuyang1 @ USTC

mfs use


mfsmaster installation

  1. Installation

Add the group and the user.

groupadd mfs

useradd -g mfs mfs

Compile and install

./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var/lib --with-default-user=mfs --with-default-group=mfs --disable-mfschunkserver --disable-mfsmount

make && make install

  2. Modify the configuration

Create the initial configuration files from the .dist templates:

cp mfsmaster.cfg.dist mfsmaster.cfg

cp mfsmetalogger.cfg.dist mfsmetalogger.cfg

cp mfsexports.cfg.dist mfsexports.cfg
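The three copies above can be wrapped in a small helper. This is only a sketch: the function name init_mfs_configs is made up here, and the directory is passed as an argument because the location of the .dist files depends on how the package was built.

```shell
# Copy every *.cfg.dist template in a directory to its live name,
# without overwriting a config file that already exists.
# init_mfs_configs is a hypothetical helper name, not part of MooseFS.
init_mfs_configs() {
    cfg_dir=$1
    for tmpl in "$cfg_dir"/*.cfg.dist; do
        [ -e "$tmpl" ] || continue      # no templates in this directory
        live=${tmpl%.dist}              # mfsmaster.cfg.dist -> mfsmaster.cfg
        if [ -e "$live" ]; then
            echo "keep   $live"
        else
            cp "$tmpl" "$live" && echo "create $live"
        fi
    done
}

# Example (the directory is an assumption based on --sysconfdir=/etc):
#   init_mfs_configs /etc
```

Existing files are left alone, so it is safe to run again after an upgrade drops new templates in place.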

Set access permissions

Edit mfsexports.cfg and add the client addresses that are allowed to mount.

  3. Initialize metadata file

Note: this file is needed only for the very first initialization.

cp metadata.mfs.empty metadata.mfs

  4. Modify hosts

Add the master's IP address to /etc/hosts under the name mfsmaster.

web gui monitoring

Run mfscgiserv; you can then check the operating status in a browser on port 9425.

metaloggerserver installation

In theory the metalogger server should run on a reasonably powerful machine. This server is used as a backup.

  1. Create the group and the user (as for the master)
  2. ./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var/lib --with-default-user=mfs --with-default-group=mfs --disable-mfschunkserver --disable-mfsmount && make && make install

cp mfsmetalogger.cfg.dist mfsmetalogger.cfg

chunkserver installation

./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var/lib --with-default-user=mfs --with-default-group=mfs --disable-mfsmaster && make && make install

cp mfschunkserver.cfg.dist mfschunkserver.cfg

cp mfshdd.cfg.dist mfshdd.cfg

  1. Configuration

Create the storage directory, e.g. /mnt/mfschunks1, and list it in mfshdd.cfg.



  • -a mounts everything listed in /etc/fstab
  • -F (used with -a) forks a separate mount for each entry, so the mounts run in parallel and one failure does not affect the others

mfsmount /mnt/mfs fuse mfsmaster=MFSMASTER_IP,mfsport=MFSMASTER_PORT,_netdev 0 0

Note on _netdev: it marks the entry as a network device, so it is mounted only once the network is up.
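To avoid typing the fstab entry by hand on every client, it can be generated. A minimal sketch following the entry above; the function name and the example address 192.0.2.10 are placeholders, and 9421 is the master port that shows up in the connection errors later in this page.

```shell
# Print an /etc/fstab entry for a MooseFS mount.
# mfs_fstab_line is a hypothetical helper; arguments are
# master host, master port and mount point.
mfs_fstab_line() {
    master=$1 port=$2 mountpoint=$3
    printf 'mfsmount %s fuse mfsmaster=%s,mfsport=%s,_netdev 0 0\n' \
        "$mountpoint" "$master" "$port"
}

# Example with placeholder values; append the output to /etc/fstab as root.
mfs_fstab_line 192.0.2.10 9421 /mnt/mfs
```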

autofs is a tool for mounting file systems automatically.


This is a very important one

mfschunkserver start

mfsmaster start

mfsmount /mnt/mfs -H mfsmaster

Retrieve lost files


Startup and Shutdown

Starting MooseFS cluster

The safest way to start MooseFS (avoiding any read or write errors, inaccessible data or similar problems) is to run the following commands in this sequence:

  • start mfsmaster process
  • start all mfschunkserver processes
  • start mfsmetalogger processes (if configured)
  • when all chunkservers get connected to the MooseFS master, the filesystem can be mounted on any number of clients using mfsmount (you can check if all chunkservers are connected by checking master logs or CGI monitor).
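The start order above can be sketched as a script. Everything except the three MooseFS commands is an assumption: the host names are invented, ssh access is assumed, and by default the script only prints what it would do.

```shell
# Dry-run sketch of the start order. Host names are hypothetical;
# set DRY_RUN=0 to really run the commands over ssh.
DRY_RUN=${DRY_RUN:-1}
MASTER_HOST=${MASTER_HOST:-mfsmaster}
CHUNK_HOSTS=${CHUNK_HOSTS:-"chunk1 chunk2"}
METALOGGER_HOSTS=${METALOGGER_HOSTS:-"metalogger1"}

# Run a command on a host, or just print it in dry-run mode.
run_on() {
    host=$1; shift
    if [ "$DRY_RUN" = 1 ]; then
        echo "[$host] $*"
    else
        ssh "$host" "$@"
    fi
}

start_mfs_cluster() {
    run_on "$MASTER_HOST" mfsmaster start || return 1        # 1. master first
    for h in $CHUNK_HOSTS; do
        run_on "$h" mfschunkserver start || return 1         # 2. all chunkservers
    done
    for h in $METALOGGER_HOSTS; do
        run_on "$h" mfsmetalogger start || return 1          # 3. metaloggers, if any
    done
    echo "check the master log or CGI monitor before mounting clients"
}

start_mfs_cluster
```

Mounting is deliberately left out: the clients should be mounted only after the chunkservers show up as connected.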

Stopping MooseFS cluster

To safely stop MooseFS:

  • unmount MooseFS on all clients (using the umount command or an equivalent)
  • stop chunkserver processes with the mfschunkserver stop command
  • stop metalogger processes with the mfsmetalogger stop command
  • stop master process with the mfsmaster stop command.

Maintenance of MooseFS chunkservers

Provided that there are no files with a goal lower than 2 and no under-goal files (which can be checked with the mfsgetgoal -r and mfsdirinfo commands), it is possible to stop or restart a single chunkserver at any time. When you need to stop or restart another chunkserver afterwards, be sure that the previous one is connected and there are no under-goal chunks.

Accelerating the speed of recovery

Speeding up chunk replication and re-balancing.

By default MooseFS gives most of the I/O to file system operations for clients and uses very little I/O for chunk replication and re-balancing. This is usually preferable in normal operation. However, when you replace an existing chunkserver, some of the chunks may be under goal and need quick attention. If you want to speed up chunk replication by sacrificing I/O for other file system operations, tweak the following two parameters in the mfsmaster.cfg file on the master server and restart the master server process. You may have to experiment a little to find the right balance between chunk replication rate and the I/O left for other file system operations.

CHUNKS_WRITE_REP_LIMIT = 5 #default value 1
CHUNKS_READ_REP_LIMIT = 25 #default value 5
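Since finding the balance takes a few rounds, a helper for changing one key at a time is handy. This is a sketch, not a MooseFS tool: set_cfg is a made-up name, and the config path in the example is an assumption that follows from --sysconfdir=/etc.

```shell
# Set KEY = VALUE in a "KEY = VALUE" style config file where '#' starts
# a comment. Every existing line for KEY (active or commented out) is
# collapsed into one new line; if there is none, the setting is appended.
# set_cfg is a hypothetical helper name.
set_cfg() {
    key=$1 value=$2 file=$3
    tmp=$(mktemp) || return 1
    awk -v k="$key" -v v="$value" '
        $0 ~ ("^[# \t]*" k "[ \t]*=") {
            if (!done) { print k " = " v; done = 1 }
            next
        }
        { print }
        END { if (!done) print k " = " v }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example (path assumed), followed by a restart of the master:
#   set_cfg CHUNKS_WRITE_REP_LIMIT 5 /etc/mfsmaster.cfg
#   set_cfg CHUNKS_READ_REP_LIMIT 25 /etc/mfsmaster.cfg
#   mfsmaster restart
```

Once the under-goal chunks are gone, the same helper can put the values back to 1 and 5.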


mfs usage cookbook

mfs is a distributed storage system.

The main features

  • Distributed storage, which brings the ability to scale out freely.
  • Multi-host replication, so the system can keep serving, or restore service, when some hosts crash. Capacity upgrades depend on the two features above.
  • File reads can take advantage of the combined speed of multiple hosts.

Known problems

  • A crash caused by sudden power failure may damage the storage system's metalogger data, which can cause files to go missing. There is no routine method for recovering missing files.

Introduction to the mfs structure

               +---------+       +------------+
               | Master  +-------+ metalogger |
               +----+----+       +------------+
        +-----------+------------+----------+
  +-----+----+ +----+-----+ +----+-----+ +--+--+
  | ChunkSvr | | chunkSvr | | chunkSvr | | ... |
  +----------+ +----------+ +----------+ +-----+



For existing broad line of storage systems company mfs


mfs startup procedure

  • start mfsmaster

On the mfsmaster node, run mfsmaster start

  • start each mfschunkserver in turn

On each ChunkSvr node, run mfschunkserver start

  • start mfsmetalogger

On the mfsmetalogger node, run mfsmetalogger start

  • mount on the client nodes

On any node on the same network, first add a line for mfsmaster to /etc/hosts.

Then run mfsmount -H mfsmaster PATH, where PATH is the directory you want to mount on.
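The hosts step is easy to get wrong by adding the same line twice. A small sketch that adds the entry only when it is missing; the helper name and the address 192.0.2.10 are placeholders, not values from the real system.

```shell
# Add "IP NAME" to a hosts file unless NAME is already listed there.
# ensure_hosts_entry is a hypothetical helper; the file defaults to /etc/hosts.
ensure_hosts_entry() {
    ip=$1 name=$2 file=${3:-/etc/hosts}
    # Match NAME as a whole word on a line that is not commented out.
    if grep -Eq "^[^#]*[[:space:]]${name}([[:space:]]|\$)" "$file"; then
        echo "$name already in $file"
    else
        printf '%s\t%s\n' "$ip" "$name" >> "$file"
        echo "added $name to $file"
    fi
}

# Example (as root, placeholder address), then mount:
#   ensure_hosts_entry 192.0.2.10 mfsmaster
#   mkdir -p /mnt/mfs && mfsmount /mnt/mfs -H mfsmaster
```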


  1. In the existing system, master, metalogger and chunkSvr are all configured to start at boot, so by default simply booting the machines restores the service.
  2. In the existing system, the mount (client) nodes cannot yet recover automatically and may need to be handled manually. A way to automate this has been found: try adding the network device parameter to the mfsmount entry.

The mfs system runs on openSUSE, where boot-time commands can be placed in /etc/rc.d/boot.local. Any other tasks that should start at boot can be added there as well.

mfs use

Once mounted with mfsmount, the mfs file system can be accessed exactly like local files.

The existing mfs system provides two services.

FTP service, used for writing newly transcoded videos; this generally uses the path below.

Streaming service, used for reading video resources; generally the /data directory on the streaming server.

mfs Monitoring and Management

The mfs monitoring system provides a web page that shows the state of every component. The tabs usually worth checking are:

Label                        Content
Info                         chunks of files inside mfs, and their replication (backup) status
Server / Disk                how the subordinate ChunkSvr nodes are working (focus on hard disk usage)
Server                       operation of the subordinate metalogger
Mounts                       client access to the mfs system
Operations                   counts of client and file operations
Master chart / Server Chart  runtime conditions: CPU, memory usage, disk usage and similar information

Solutions to common problems

If mfsmaster stops unexpectedly, how do I restore it?

There are two common ways to recover.

1. On the mfsmaster node, run mfsmetarestore -a to recover.

2. Copy the relevant metadata files from the metalogger node, and then try to recover from them.
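The second way can be sketched as a list of commands. This only prints a plan, it does not run anything. Assumptions to check before using it: the data directory /var/lib/mfs (it follows from --localstatedir=/var/lib), the invented host name metalogger1, the metadata_ml.mfs.back and changelog_ml.*.mfs file names, and the -m / -o options of mfsmetarestore, which should be confirmed with man mfsmetarestore for the installed version.

```shell
# Print (do not run) the commands for recovering from metalogger files.
# DATA_DIR, METALOGGER and the file names are assumptions, see above.
DATA_DIR=${DATA_DIR:-/var/lib/mfs}
METALOGGER=${METALOGGER:-metalogger1}

plan_restore_from_metalogger() {
    echo "scp '$METALOGGER:$DATA_DIR/metadata_ml.mfs.back' '$DATA_DIR/'"
    echo "scp '$METALOGGER:$DATA_DIR/changelog_ml.*.mfs' '$DATA_DIR/'"
    echo "mfsmetarestore -m '$DATA_DIR/metadata_ml.mfs.back' -o '$DATA_DIR/metadata.mfs' $DATA_DIR/changelog_ml.*.mfs"
    echo "mfsmaster start"
}

plan_restore_from_metalogger
```

Reviewing the printed plan first is on purpose: a wrong restore can overwrite the only good copy of metadata.mfs, so back up the data directory before running any of it.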

How to diagnose the entire system

Problems with the storage system are generally noticed when it stops providing storage service.

When that happens, first make sure network access is fine, then try to open the monitoring web page.

If the page cannot be opened, start it manually on the master node with mfscgiserv start, and then try to access the page again.

Through this page, check whether any part of mfs has failed. If some chunkSvr is not up, connect to that machine and start the chunkSvr service manually: mfschunkserver start

Once all chunkSvr are working normally, check the Mounts tab of the web page to see which nodes are mounted. If a node that needs the storage is not mounted, go to that client node and run mfsmount PATH -H mfsmaster.

ls does not work on the file system: typing ls hangs the shell, or the files are missing under the mount path

  • This is usually caused by a broken mount point.
    • Run umount /data or umount /var/ftp/pub
    • Then re-mount the corresponding path
  • If umount cannot release the mount point, kill the mfsmount process directly to force the mount point off: killall -9 mfsmount

When mounting, a message like "can not connect 9421 ..." appears

  • The master server is not running; start the master server.

When mounting, a message like "can not ... data" appears

  • More than half of the chunkservers are failing; restart the chunkservers.

Does mfsmount have a caching policy?

Yes. mfsmount caches the data currently being transferred from the chunkservers locally.

The default cache size is said to be 250MB, but in the code it is actually 128MB. It can be set in the range 16MB to 2GB; in practice we can set it to the maximum.

Meanwhile, mfs also has the concept of a cache mode.

AUTO: the default mode.

NO: never cache.

YES: always cache (not recommended).

mfsmount requests the corresponding metadata every time (the metadata cache expires after 1s).
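The cache settings above are passed to mfsmount as -o options. The sketch below builds such a command line and rejects values outside the range. The option names mfswritecachesize (in MiB) and mfscachemode are what the mfsmount man page documents, but check them against the installed version; the function name is made up.

```shell
# Build an mfsmount command line with an explicit cache size and cache mode.
# mfsmount_cmd is a hypothetical helper; it prints the command, it does not mount.
mfsmount_cmd() {
    mountpoint=$1 cache_mb=$2 mode=$3
    case $cache_mb in
        ''|*[!0-9]*) echo "cache size must be a whole number of MB" >&2; return 1 ;;
    esac
    if [ "$cache_mb" -lt 16 ] || [ "$cache_mb" -gt 2048 ]; then
        echo "cache size must be 16..2048 MB" >&2
        return 1
    fi
    case $mode in
        AUTO|YES|NO) ;;
        *) echo "cache mode must be AUTO, YES or NO" >&2; return 1 ;;
    esac
    echo "mfsmount $mountpoint -H mfsmaster -o mfswritecachesize=$cache_mb -o mfscachemode=$mode"
}

# Maximum cache with the default mode:
mfsmount_cmd /mnt/mfs 2048 AUTO
```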

Similarly, MooseFS limits the number of open files. The default is 100K file descriptors.

More details

Multiple mfsmount processes do not share their caches with one another.

This holds even for mfsmount processes configured with the same local IP, let alone ones configured with different IPs.

mfsmount fails with a --no-cano ... kind of error; what does this mean?

Note that mfsmount calls fuse (and has some requirements on the fuse version).

fuse in turn depends on the system's mount.

fuse started passing this argument unconditionally from version 2.8.6, but some relatively old versions of mount (util-linux), such as 2.17, do not have it yet and therefore cannot recognize it.

The parameter stops mount from resolving the mount paths through symbolic links.

The solution:

  1. Upgrade your util-linux
  2. Downgrade your fuse
  3. Modify the fuse source: the parameter is in fuse/lib/mount_util.c and can simply be deleted there.

mfsmount with multiple network adapters

With several network cards, expanding the bandwidth is not as simple as plugging in the cables and configuring the IPs.

Linux follows its local routing rules, so these rules must be configured to make data leave through the card we want. Otherwise, by default, all data goes out through the first card, eth0, and the extra NIC bandwidth cannot be used effectively.

To configure the local routing rules:

  1. Add the card and set its IP
  2. echo 12 eth1table >> /etc/iproute2/rt_tables
  3. The rule generally looks like this:

ip route add default dev eth1 src table eth1table

This routing rule means that packets from this source address, to any destination, can be sent out through the device eth1.

  4. Apply the rule

ip rule add from table eth1table

Data from that source address is then routed using the table eth1table.

  5. Make it take effect

ip route flush cache
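Put together, the steps for one extra card look like the sketch below. It only prints the commands so they can be reviewed first. The function name is invented and 192.0.2.21 stands in for the IP that was set on eth1; the table number 12 and the name eth1table are the ones used above.

```shell
# Print the policy-routing commands for one extra NIC.
# plan_nic_routing is a hypothetical helper; arguments are
# interface, its IP address, table number and table name.
plan_nic_routing() {
    dev=$1 ip=$2 table_id=$3 table=$4
    echo "echo '$table_id $table' >> /etc/iproute2/rt_tables"   # register the table
    echo "ip route add default dev $dev src $ip table $table"   # default route in that table
    echo "ip rule add from $ip table $table"                    # use it for traffic from this IP
    echo "ip route flush cache"                                 # make it take effect
}

plan_nic_routing eth1 192.0.2.21 12 eth1table
```

Each additional card needs its own table number and name, so the function is called once per card.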

Internal inquiry


In the cache part of mfsmount, the temporary in-memory data structure is of type cblock_s. Each block stores 1K or 10K of data. The macro for this is in mfscommon/MFSCommunication.h. This size affects every chunkserver and the master, so it must not be changed casually. (In fact, for the purposes of our system, increasing the block size, e.g. to 4M, would be feasible.)

This can be modified so that it gets more memory, beyond 2048.

Each unit could also be made larger, to better suit our static files.

This part of the code is in mfsmount/main.c and mfsmount/writedata.c.

2014-01-03 11:46:25 Only a write cache? That cannot be right! Tests did show a read cache too!

SSD cache expansion

Use external tools such as DM-cache, EnhanceIO or Bcache.

Or the flashcache tool.

But it seems these are all used on the chunkserver to improve read and write speed.


A typical setup uses 12/16GB of memory, 4/6 mechanical hard drives and a 60GB/120GB SSD, with the SSD divided into 4/6 partitions (matching the number of hard drives) plus a system partition.

For details on this part, see ssd_cache.

2014-01-15 11:00:10 mfsmaster did not start successfully (cgi and the chunkservers were all normal)

As a result, mfsmount did not start successfully.