The original is at this Chinese blog:
I'm going to try to fix the translation here:
useradd -g mfs mfs
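On a fresh system the mfs group may not exist yet, so the useradd above would fail; a minimal sketch (assuming shadow-utils style tools, run as root) would be:

```shell
# create the mfs group first, then a system user without a login shell
groupadd mfs
useradd -g mfs -s /sbin/nologin mfs
```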
Compile and install
./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var/lib --with-default-user=mfs --with-default-group=mfs --disable-mfschunkserver --disable-mfsmount
make && make install
- Modify the configuration
Create the initial configuration files
cp mfsmaster.cfg.dist mfsmaster.cfg
cp mfsmetalogger.cfg.dist mfsmetalogger.cfg
cp mfsexports.cfg.dist mfsexports.cfg
Set access permissions
Edit mfsexports.cfg, adding the allowed target (client) addresses.
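For illustration, an mfsexports.cfg entry granting a subnet read-write access might look like this (the subnet is an assumption; adjust it to your network):

```
# network        path  options
192.168.1.0/24   /     rw,alldirs,maproot=0
```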
- Initialize metadata file
Note: this file is only needed at the time of first initialization.
cp metadata.mfs.empty metadata.mfs
- Modify hosts
Add the mfsmaster IP address to /etc/hosts
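For example (using the master IP that appears later in these notes):

```
# /etc/hosts
192.168.1.137  mfsmaster
```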
Web GUI monitoring
After running mfscgiserv, you can check the running status on port 9425.
In theory, the more powerful the metalogger server's machine, the better; this server is used as a backup.
- Creating user groups and users
- ./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var/lib --with-default-user=mfs --with-default-group=mfs --disable-mfschunkserver --disable-mfsmount && make && make install
cp mfsmetalogger.cfg.dist mfsmetalogger.cfg
./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var/lib --with-default-user=mfs --with-default-group=mfs --disable-mfsmaster && make && make install
cp mfschunkserver.cfg.dist mfschunkserver.cfg
cp mfshdd.cfg.dist mfshdd.cfg
Create a directory /mnt/mfschunks1
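A sketch of preparing the chunkserver storage directory and registering it in mfshdd.cfg (path taken from the note above, run as root):

```shell
# create the storage directory and hand it to the mfs user
mkdir -p /mnt/mfschunks1
chown -R mfs:mfs /mnt/mfschunks1
# each line in mfshdd.cfg is one storage directory
echo /mnt/mfschunks1 >> /etc/mfshdd.cfg
```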
- -a: mount everything listed in fstab automatically
- -F: fork a separate mount process per entry (so entries are mounted at the same time, and one failure does not affect the other machines)
mfsmount /mnt/mfs fuse mfsmaster=MFSMASTER_IP,mfsport=MFSMASTER_PORT,_netdev 0 0
Note _netdev: this marks a network device, which must only be activated once the network is up.
autofs is a tool that mounts file systems automatically.
This is the important one:
mfsmount / mnt / mfs -H mfsmaster
Retrieve lost files
Startup and Shutdown
Starting MooseFS cluster
The safest way to start MooseFS (avoiding any read or write errors, inaccessible data or similar problems) is to run the following commands in this sequence:
- start mfsmaster process
- start all mfschunkserver processes
- start mfsmetalogger processes (if configured)
- when all chunkservers get connected to the MooseFS master, the filesystem can be mounted on any number of clients using mfsmount (you can check if all chunkservers are connected by checking master logs or CGI monitor).
Stopping MooseFS cluster
To safely stop MooseFS:
- unmount MooseFS on all clients (using the umount command or an equivalent)
- stop chunkserver processes with the mfschunkserver stop command
- stop metalogger processes with the mfsmetalogger stop command
- stop master process with the mfsmaster stop command.
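The start/stop order above can be sketched as a script (the host names are placeholders, not from the original; each command runs on the node named in the comment):

```shell
#!/bin/sh
# start order: master -> chunkservers -> metalogger -> client mounts
mfsmaster start                                   # on the master node
for h in chunk1 chunk2; do
    ssh "$h" mfschunkserver start                 # on each chunkserver node
done
ssh metalogger1 mfsmetalogger start               # on the metalogger node
# ...mount clients with mfsmount once all chunkservers have connected.
# stop order is the exact reverse: unmount clients first, stop the master last.
```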
Maintenance of MooseFS chunkservers
Provided that there are no files with a goal lower than 2 and no under-goal files (which can be checked with the mfsgetgoal -r and mfsdirinfo commands), it is possible to stop or restart a single chunkserver at any time. When you need to stop or restart another chunkserver afterwards, be sure that the previous one is reconnected and there are no under-goal chunks.
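Using the two commands named above, the pre-maintenance check might look like this (the mount point is an assumption):

```shell
# goal (replication count) of every file under the mount point
mfsgetgoal -r /mnt/mfs
# per-directory summary, including chunk counts
mfsdirinfo /mnt/mfs
```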
Accelerating the speed of recovery
Speeding up chunk replication and re-balancing.
By default MooseFS gives higher I/O priority to clients' file system operations and uses very little I/O for chunk replication and re-balancing. This is mostly preferable in normal operation. However, when you replace an existing chunkserver, some of the chunks may be under goal and need quick attention. If you want to speed up the chunk replication process by sacrificing I/O for other file system operations, tweak the following two parameters in the mfsmaster.cfg file on the master server and restart the master server process. You may have to experiment a little to find the right balance between chunk replication rate and the I/O available for other file system operations.
CHUNKS_WRITE_REP_LIMIT = 5 # default value 1
CHUNKS_READ_REP_LIMIT = 25 # default value 5
mfs Usage Cookbook
mfs is a distributed storage system.
The main features
- Distributed storage; free distribution brings easy scalability.
- Multi-host replication of parts of the data means the system can still provide service after crashes, and the service can be restored.
Capacity upgrades depend on the two features above.
- File reads can take advantage of the speed of multiple hosts.
Problems currently found
- A crash caused by sudden power loss may corrupt the metalogger / storage-system metadata, causing files to go missing. There is no routine method for recovering the missing files.
Introduction to the mfs structure
mfs serves as the storage system for the company's existing production line.
mfs startup policy
- Start mfsmaster
On the mfsmaster node, run mfsmaster start
- Start the mfschunkservers in turn
On each chunkserver node, run mfschunkserver start
- Start mfsmetalogger
On the mfsmetalogger node, run mfsmetalogger start
- Mount on the other client nodes
On any node in the same network, first add a line for mfsmaster to /etc/hosts.
Then you can run
mfsmount PATH -H mfsmaster, where PATH is the destination you want to mount at.
- In the existing system, the master, metalogger and chunkservers are all configured to start at boot, so by default they come back up directly and the service is restored.
- In the existing system, the mounts on client nodes cannot recover automatically; you may need to handle them manually.
A way to do it automatically has been found: you can try adding the network-device (_netdev) parameter to the mfsmount command ~~
The mfs system runs in an openSUSE environment; in this environment, boot-time startup commands can be placed in /etc/rc.d/boot.local. If there are other self-starting tasks, add them there as well.
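For openSUSE's boot.local, the additions might look like this (a sketch; each line belongs only on the node that runs that service):

```shell
# /etc/rc.d/boot.local -- executed at the end of boot
mfsmaster start       # on the master node
mfschunkserver start  # on chunkserver nodes
mfsmetalogger start   # on the metalogger node
```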
Through an mfsmount mount, you can access mfs file-system resources exactly as you access native files.
The existing mfs system provides two services:
An FTP service, for writing newly transcoded videos; this is generally under the path 192.168.1.67:/var/pub/data.
A streaming service, for reading video resources; generally the streaming server's /data directory.
mfs Monitoring and Management
The mfs monitoring system provides a web page on which you can see all the states you need to check. The main items are:
|Info|Internal file-chunk and replication (backup) status of mfs|
|Server / Disk|Status of the subordinate chunkservers (pay attention to their disk usage)|
|Server|Status of the subordinate metaloggers|
|Mounts|Client access to the mfs system from outside|
|Operations|Counts of the client and file operations performed|
|Master Chart / Server Chart|Various runtime conditions: CPU, memory usage, disk usage and so on|
Solutions to common problems
If mfsmaster stops unexpectedly, how do we restore it?
There are two common ways to recover.
1. On the mfsmaster node, run
mfsmetarestore -a to recover.
2. Copy the appropriate metadata files over from the metalogger node, then try to recover.
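The two recovery paths might look like this (a sketch; the file locations assume the --localstatedir=/var/lib used in the build above, and "metalogger" must resolve to that node):

```shell
# 1) on the master node, replay the changelogs into a fresh metadata.mfs
mfsmetarestore -a

# 2) or copy the metadata backup and changelogs from the metalogger node,
#    then replay them explicitly
scp metalogger:/var/lib/mfs/metadata_ml.mfs.back /var/lib/mfs/
scp "metalogger:/var/lib/mfs/changelog_ml.*.mfs" /var/lib/mfs/
mfsmetarestore -m /var/lib/mfs/metadata_ml.mfs.back \
               -o /var/lib/mfs/metadata.mfs \
               /var/lib/mfs/changelog_ml.*.mfs
```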
How to diagnose the entire system
Usually a problem with the storage system is discovered when it stops providing storage service.
At that point, first make sure network access is fine, then try to open the page http://192.168.1.137:9425.
If the page cannot be opened, start the CGI server manually on the master node with
mfscgiserv start, then try to access the page again.
Use this page to check whether the mfs system has a fault. If some of the chunkservers are down, connect to the corresponding machines and start the chunkserver service manually.
Once all chunkservers are confirmed normal, check the web page's
Mounts tab and observe the mounted nodes. If a node that needs the storage is not mounted, go to the corresponding client node and run the
mfsmount PATH -H mfsmaster command.
ls hangs on the file system (typing ls in the shell gets stuck), or the files do not exist under the corresponding mount path
- This problem is usually a mount-point exception.
- Run umount /data or umount /var/ftp/pub
- Then re-mount the corresponding path
- If umount does not manage to release the mount point, you can kill the mfsmount process directly to force the mount point off: killall -9 mfsmount
Mounting reports something like "can't connect 192.168.1.137 9421 ..."
- The main (master) storage server is not started; start the master storage server
Mounting reports something like "can't ... data"
- More than half of the file-block (chunk) storage servers are down; restart the chunk storage servers
Does mfsmount have a caching policy?
It does. mfsmount caches the data currently transferred from the chunkservers locally.
The default cache size is
250MB (in the code it is actually 128MB). It can be set in the range 16MB ~ 2GB; in practice we can set it to the maximum.
Meanwhile, mfs also has the concept of a cache mode:
AUTO, the default mode.
NO, never cache.
YES, always cache (not recommended).
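The cache mode and cache size can be set per mount; a hedged example, assuming your mfsmount build supports the mfscachemode and mfswritecachesize mount options:

```shell
# force caching on (not recommended per the note above)
# and raise the write cache to its 2GB (2048 MiB) maximum
mfsmount /mnt/mfs -H mfsmaster -o mfscachemode=YES,mfswritecachesize=2048
```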
mfsmount requests the corresponding meta-information on every operation (the meta-information cache expires after 1s).
Similarly, moosefs limits the number of open files; the default is 100K file descriptors.
Multiple mfsmount processes do not share their caches with one another.
Not even mfsmount processes configured with the same local IP, let alone ones configured with different IPs.
What does it mean when mfsmount hits an error like --no-cano ...?
Note that mfsmount calls fuse (and has some version requirements on fuse),
and fuse in turn depends on the system's mount utility.
fuse started passing this argument unconditionally as of 2.8.6, but some relatively old versions of mount (util-linux), such as 2.17, do not have this parameter yet and therefore cannot recognize it.
The parameter disables canonicalization of the mount path (resolving symlinks) at mount time.
- Upgrade your util-linux, or
- Downgrade your fuse, or
- Modify the fuse source: the parameter can simply be deleted in the file fuse/lib/mount_util.c.
mfsmount with multiple network adapters
If you have several NICs and want to expand bandwidth, it is not as simple as plugging in network cables and configuring IPs.
Linux follows local routing rules, so you must configure them so that the traffic leaves through the NIC you want; otherwise, by default, all traffic goes out through the first NIC, eth0, and the extra NIC bandwidth simply cannot be used effectively.
How to configure the local routing rules:
- Add the NIC and set its IP
- echo 12 eth1table >> /etc/iproute2/rt_tables
- Add the route; the rule is generally like this:
ip route add default dev eth1 src 192.168.1.2 table eth1table
This routing rule means: packets from 192.168.1.2 to any address can be sent out through device eth1.
- Apply the rule
ip rule add from 192.168.1.2/32 table eth1table
Traffic from 192.168.1.2 now uses routing table eth1table.
- Flush the route cache so it takes effect
ip route flush cache
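Putting the steps above together, a sketch for a second NIC (the IP and table number are this document's examples; run as root):

```shell
# 1) name routing table 12 "eth1table"
echo "12 eth1table" >> /etc/iproute2/rt_tables
# 2) the default route of that table goes out through eth1
ip route add default dev eth1 src 192.168.1.2 table eth1table
# 3) packets sourced from eth1's IP consult that table
ip rule add from 192.168.1.2/32 table eth1table
# 4) flush the route cache so the change takes effect
ip route flush cache
```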
In mfsmount, the cache keeps temporary in-memory data in structures of type cblock_s. Each block stores 1K or 10K of data; this is a macro in mfscommon/MFSCommunication.h. The size affects all chunkservers and the master, so it must not be changed at random. (In fact, for our system, increasing the block size, e.g. to 4M, is possible.)
Here you can also modify it to get more memory, more than 2048 blocks.
And each unit can be made even bigger, to better suit our static-file workload.
This part of the code is in mfsmount/main.c and mfsmount/writedata.c.
2014-01-03 11:46:25 Only a write cache? That can't be right!!! Tests clearly show a read cache too!!!
Use external tools such as DM-cache, EnhanceIO or Bcache.
But it seems these are used on the chunkservers to improve read/write speed.
A typical setup uses 12/16GB of memory, 4/6 mechanical hard drives and a 60GB/120GB SSD, with the SSD divided into 4/6 partitions (matching the number of hard drives) plus a system partition.
For the specifics of this part, see ssd_cache.
2014-01-15 11:00:10 mfsmaster did not start successfully (the CGI server and the various chunkservers were normal).
This caused mfsmount to fail to start.