BSD Server Maintenance


September 30, 2013

BSD Magazine Article

Many sites and handbooks explain how to install servers, but once a server is running, can an administrator keep it that way? This article will cover the basics of maintaining a BSD server.

Many server administrators may understand how to set up a FreeBSD server, but once it is up, can they keep it running? Many companies may store business transactions on servers like stock lists and item prices. If the server(s) goes down, a part of or the entire company is unable to continue or perform well until the server is back up. Also, if the server is used to host a website that sells products, customers cannot make purchases until the server is fixed. This means that the company loses money until the server is back online. If the data is lost, this means the company loses more profits. Clearly, it is important to keep the servers healthy and the data safe and secure.

When dealing with servers, the two most important directories that all administrators should thoroughly understand and learn are /etc/ and /var/.
NOTE: I am using FreeBSD v9.1 for my descriptions and examples, but this article is valid for any BSD distro and many Linux/Unix operating systems.

Logs
When maintaining a server, it is best to prevent issues before problems arise. Thankfully, many servers have a log system. Most applications/services have a log that lists the software’s status and errors. When an error is spotted, the administrator should take the time to investigate and prevent future disasters.
Logs are stored in /var/log/. The logs are given names that easily identify their content or purpose. For example, the Samba log would be /var/log/smb.log. Some applications (rarely) store logs in the current user’s home folder. If a log cannot be found, check in root’s home folder or read the application’s manual for the log location.

If the HTTP services are having specific issues and the IT tech does not want to spend a lot of time searching through the Apache log, then the tech can do a specific search like below.
grep SOME_ISSUE_TO_SEARCH /var/log/httpd-error.log
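
When it is not yet clear which term is the useful one, a quick count per term can show where to look first. The following is a minimal sketch, not part of the original article; the function name count_terms is made up for illustration, and the log path in the usage comment is the one used above.

```shell
#!/bin/sh
# Count how many lines of a log mention each search term.
# Usage: count_terms LOGFILE TERM [TERM ...]
count_terms() {
    logfile=$1
    shift
    for term in "$@"; do
        # -c: print the number of matching lines, -i: ignore case,
        # -F: treat the term as a fixed string, not a regular expression.
        # grep exits non-zero when nothing matches, hence "|| true".
        count=$(grep -c -i -F -- "$term" "$logfile" || true)
        printf '%s: %s\n' "$term" "$count"
    done
}

# Example (adjust the path to wherever Apache writes its error log):
# count_terms /var/log/httpd-error.log connection failed 404
```

A term with a high count is a reasonable starting point for a closer look with grep or less.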

For example, when searching for a connection error, search for words like “connection”, “failed”, “404” (the HTTP status code for “Not Found”), and other similar terms. A server maintainer may use a command like the above to search the boot-up logs for lines containing the word “Ethernet” when there appears to be an Ethernet issue. There is also a command specifically for viewing the kernel and boot messages, seen below. FreeBSD additionally saves a snapshot of the boot messages in /var/run/dmesg.boot.
dmesg -a | less
This command will allow the administrator to scroll through the boot messages. When searching for entries concerning the USB system, for instance, try the below command.
dmesg -a | grep usb
It is important to occasionally browse the output of the dmesg command, because sometimes errors may start occurring with the hardware or boot-up process. If the server has problems booting up, the network’s or company’s performance may suffer.

Paper Logs
It is best to keep a physical log in the server room or some other secure storage location. This log should include a list of all files changed, software installed, and hardware repaired along with answers to the following questions – who, what, why, when, where, and how.

Who
It helps to keep a written record of all IT techs that have dealt with the server. This helps keep track of all who may be aware of an error. Assume the server is having errors with the NTP services again. If Bill, the IT tech, was the last one to fix it, it may be best to have Bill check the issue again since he is familiar with it or to ask Bill for suggestions. If the server always tends to break after Andrew makes changes to the server, it may be best to investigate the issue.

What
IT techs may find it helpful to keep track of what other techs are fixing. If there is a network connection issue and the Ethernet devices have already been replaced a few times, then it may be safe to assume the problem lies somewhere else. Also, if techs keep returning to the server because of repeated reports of Samba issues, then the Samba services may need to be completely reinstalled (remove /usr/local/samba and run pkg_add -r samba35).

Why
Knowing why a particular piece of software or hardware is being updated, repaired, and so on may help narrow down the source of future errors. For illustration, if Bill reinstalled the NIS services because they repeatedly locked up, and the same issue occurs again the next day, then Casey knows that another reinstallation will not fix it.

When
The time errors occur or when they were fixed can assist some techs. For example, if the server is always powered up at 5:00 am and syslogd crashes at 5:03 am every day, then most likely a service is starting up that conflicts with syslogd. This would then point to the cron tables, at tables, or /etc/rc* scripts (some BSD systems structure their rc system differently). Any process that starts at 5:03 am or a little before may be causing the crash. As another example, if the last backup was performed a week earlier, then the tech looking over the paper log should perform a backup or assign the task to someone else. This time log would also keep administrators from wrongly ruling out the real cause of a problem. For instance, assume the server has a virus. The virus scanner is not finding the virus for some unknown reason. If the tech sees that the last antivirus update was three months ago, then the tech has good reason to assume that updating the antivirus definitions will enable the scanner to find and remove the virus. Otherwise, the tech will falsely assume the scanner is up to date and waste precious time looking elsewhere for the fix.

Where
This may not be a very useful piece of information, but it may help in some instances. This would answer “from where was the server managed/fixed?” Servers can be fixed or repaired locally, remotely via secure shell (SSH), or other methods. If there are suspicions that the server is being hacked from a certain IP address, but it is later seen that the log states Andrew proof-read Bill’s /etc/rc.conf edits through a remote shell using the suspicious IP address, then we can rule out that address.

How
This is an important piece of information. How was the problem diagnosed and fixed? Was the DNS server’s hard-drive replaced or reformatted? What command was used? This helps future problems get repaired the right way (if it worked) or differently if the issue still persists. This also helps to inform other techs what has already been done and tried with the server. Also, when making such logs, include which server (if there is more than one) and which OS (if the server is set up for a dual-boot or virtual system). It may help to have a separate file for each server. A log containing errors for an unsolvable issue should be printed out for future reference just in case the server loses its files. It will be more difficult to fix an issue if the logs cannot be viewed.
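
Entries are easier to compare when every one answers the six questions in the same order. The sketch below is one possible way to produce such entries for printing; the function name log_entry and the field layout are assumptions made for illustration, not something prescribed by the article.

```shell
#!/bin/sh
# Append one maintenance entry that answers who/what/why/when/where/how.
# Usage: log_entry FILE WHO WHAT WHY WHERE HOW
log_entry() {
    file=$1
    {
        printf 'Server: %s\n' "$(uname -n)"                 # which machine
        printf 'When:   %s\n' "$(date '+%Y-%m-%d %H:%M')"   # time of the entry
        printf 'Who:    %s\n' "$2"
        printf 'What:   %s\n' "$3"
        printf 'Why:    %s\n' "$4"
        printf 'Where:  %s\n' "$5"
        printf 'How:    %s\n' "$6"
        printf '%s\n' '----'                                # entry separator
    } >> "$file"
}

# Example:
# log_entry ~/maintenance.log Bill "Restarted ntpd" "Clock drift" \
#     "SSH from the office" "service ntpd restart"
```

The resulting file can be printed and filed with the paper log, one file per server.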

Finding Problems
If users report that they cannot access files or information, then it may be wise to rule out as many issues as possible. On some of the workstations, type ping <SERVER_IP_ADDRESS>. If the ping fails, then the problem most likely lies in the network connection, although a server that is powered off or that blocks ping requests will also fail this test. To help better pin-point the problem area, type traceroute <SERVER_IP_ADDRESS>. This command will print each hop between the workstation and server. If the connection stops at a router, then the router or the devices before or after the router may be the cause of the connection error. If the connection is fine, the last stop should be the server. Once the network has been shown to be fine, the server itself is the likely source of the problem. If the inaccessible files are accessed via FTP, then the administrator may want to make sure the FTP process is still running. In a command-line, type:
pgrep ftpd
If the FTP daemon is running, the administrator should get a number (the PID) as the output. Then, the tech may want to restart the daemon using the command below.
service ftpd restart
If no output is received, then the FTP process closed for some reason. To turn it back on, type:
service ftpd start
The FTP server should be back online. Now would be a good time to investigate what caused the FTP daemon to close. To do this, check the logs. FreeBSD’s ftpd logs through syslogd, and the stock /etc/syslog.conf sends FTP messages to /var/log/xferlog (other systems and other FTP daemons may use a different file under /var/log/). This is a plain text file. In a command-line, it can be read in one of many ways. The best way is to use the less command, which allows the user to scroll up and down. To see the last ten lines, use the tail command.
less /var/log/xferlog
tail -n 10 /var/log/xferlog
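
The check-then-start sequence above can be wrapped in a small helper. This is a sketch under the assumption that the daemon's process name matches its service name, as it does for ftpd; the function names daemon_status and ensure_started are invented for this example.

```shell
#!/bin/sh
# Report whether a process with exactly this name is running.
# Prints "running" or "stopped"; it does not start or stop anything.
daemon_status() {
    if pgrep -x "$1" >/dev/null 2>&1; then
        echo running
    else
        echo stopped
    fi
}

# Start the service only when it is not already running.
# service(8) is the command used in the article; this needs root.
ensure_started() {
    if [ "$(daemon_status "$1")" = stopped ]; then
        service "$1" start
    fi
}

# Example:
# daemon_status ftpd
# ensure_started ftpd
```

Because ensure_started only acts when the daemon is down, it is safe to run repeatedly, for example from a periodic check.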

If the FTP daemon is not starting when the server turns on, then the administrator needs to check /etc/rc.conf and make sure the line ftpd_enable="YES" is in the configuration file. If not, then that is why the daemon never started; no script told it to start. If for some reason the above still does not help, check the access-control files. Administrators can set up which users are allowed or not allowed to access certain services. The file /etc/ftpusers lists users that are not permitted to access any service or file provided by the FTP daemon. If many or all users are listed here, then that would explain why no one can access the files. To allow everyone to use the FTP services, type:
echo "" > /etc/ftpusers

This will erase the list and allow all users to use the FTP server. If there are some users that should not access the FTP portion of the server, make a list of the users and re-add them to the list using a preferred text editor. The default list usually contains system accounts such as root, and these should normally stay blocked. Many users use Vi or Emacs in the command-line.
NOTE: Before erasing the list, remember to keep a backup of the ftpusers file by copying it to root’s home folder or some other designated backup location.

To make sure the BSD server is recognizing all of its network devices, type ifconfig and make sure all of the Ethernet ports and wireless devices are listed. If any of the network devices are missing, then the operating system is missing a driver or that device is physically broken.
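
Taking the backup and emptying the file can be done in one step so that the backup is never forgotten. The following is a minimal sketch; the function name backup_and_clear and the timestamped file naming are assumptions made for this example.

```shell
#!/bin/sh
# Copy FILE into BACKUP_DIR with a timestamp suffix, then empty FILE.
# Prints the path of the backup copy.
# Usage: backup_and_clear FILE BACKUP_DIR
backup_and_clear() {
    file=$1
    backup_dir=$2
    stamp=$(date '+%Y%m%d%H%M%S')
    backup="$backup_dir/$(basename "$file").$stamp"
    # The file is only emptied when the copy succeeded.
    cp -p "$file" "$backup" && : > "$file" && echo "$backup"
}

# Example (run as root):
# backup_and_clear /etc/ftpusers /root
```

The printed path makes it easy to note in the maintenance log where the previous list was saved.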

Checking the File System
If the filesystem goes bad, then the data it contains may be damaged or lost. Simply restoring from a backup onto the damaged filesystem will not help. Look at the filesystem as a landscape or garden. If the soil becomes rocky and bad for the plants, a gardener cannot replant a new plant without first making the soil healthy. The same goes for a filesystem. To check a filesystem for errors, use fsck.
fsck -t ufs

The command will check UFS filesystems for errors; on FreeBSD, the -t option restricts the check to the named filesystem type. Run the check on a filesystem that is unmounted or mounted read-only. If an error is found, it may be wise to make an immediate backup of important savable files and then reformat or replace the hard-drive. Remember, when checking a filesystem using fsck, specify the filesystem type to be inspected so that the right checker is used. ZFS is not checked with fsck at all; a ZFS pool is verified with zpool scrub instead.
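
Since the right tool depends on the filesystem type, it can help to write the choice down once. The sketch below only prints the command it would suggest and runs nothing; it assumes FreeBSD's fsck (where -t selects the type and -n answers "no" to every repair prompt) and zpool scrub for ZFS. The function name check_command, the device /dev/ada0p2, and the pool name tank are placeholders.

```shell
#!/bin/sh
# Print (do not run) a check command suited to the filesystem type.
# Usage: check_command FSTYPE TARGET
check_command() {
    case $1 in
        ufs) echo "fsck -t ufs -n $2" ;;   # -n: report problems, change nothing
        zfs) echo "zpool scrub $2" ;;      # ZFS is verified with a scrub, not fsck
        *)   echo "unsupported filesystem type: $1" >&2
             return 1 ;;
    esac
}

# Examples (placeholder device and pool names):
# check_command ufs /dev/ada0p2
# check_command zfs tank
```

Printing the command first gives the tech a chance to confirm the target before anything touches the disk.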
Generally, the best way to prevent or ease the repair of bad filesystems is to use a RAID system with parity. Then, the damaged storage unit can be removed and replaced with a new or repaired unit. FreeBSD will then recreate the data using the parity system.

Problems with Finding Problems
Logs are very useful in solving many problems. However, logs may not always be there to help. For instance, if the server locks up, syslogd (the process that makes logs) will not be able to write the logs. Then when the logs are viewed, nothing will be seen for the sequence of events that led up to the disaster. If malware erases the hard-drive, no logs will be seen. Also, if the hard-drive or filesystem is corrupted, then no logs will be seen either.

GUI
The default FreeBSD installation lacks a graphical user interface (GUI). For a server, the administrator must take some details into consideration before installing a GUI. A GUI would make a system easier to repair and maintain. However, this would make it easier for someone within the company to ruin the server and its data. Also, there would be more software that could cause a conflict with existing programs. If for some reason a GUI must be installed on a server, it is best to install a graphical user interface with a small footprint (one that uses very few resources). Clearly, KDE, GNOME, Mate, Cinnamon, and Unity are not good choices for desktop environments on a server. Some graphical interfaces suitable for a server include Afterstep, Ratpoison, Enlightenment, Blackbox, Fluxbox, and other similar interfaces. XFCE or LXDE may work well on a server, but it may be best not to install a desktop interface that large on a server.

NOTE: FreeBSD does not include a graphical user interface in the base system. One must be installed separately from packages or built from ports. BSD systems may have problems with some graphical user interfaces. It is best to avoid desktop interfaces unless it is absolutely necessary.

Quick Fixes
If a server daemon is found to not run on boot-up, of course, it needs to be added to /etc/rc.conf. However, some administrators may not like Vi or Emacs and do not have time to install a preferred text editor. Well, there is a quick fix. To quickly and easily make a daemon start when the system loads up, type the following command:
echo 'apache22_enable="YES"' >> /etc/rc.conf

If an entry in /etc/rc.conf is spelled wrong, it can swiftly be corrected. Assume the misspelled line apachy22_enable="YES" was added instead. Type the below command to make the correction.
sed -i '' -e 's|apachy22|apache22|' /etc/rc.conf
The above command will perform a find-and-replace in the file itself (changes take place instantly). FreeBSD’s sed requires an argument after -i; the empty string '' means no backup copy is kept, while a suffix such as .bak keeps one. No regex wildcards are used, so the exact string will be matched and changed. Unless the administrator is very skilled with regex and has thoroughly read the configuration file, no one should use regex in such an important system file. Otherwise, settings that should be left alone will get changed if the tech is not careful. This can cause the server problems, and the administrator will have to spend time finding and fixing the problem.
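
Both quick fixes can be made a little safer: avoid adding the same line twice, and keep a copy of the file before changing it. The sketch below is one way to do that with portable tools; the function names enable_service and replace_in_file are made up for this example.

```shell
#!/bin/sh
# Add NAME_enable="YES" to an rc.conf-style file unless the exact
# line is already present.
# Usage: enable_service FILE NAME
enable_service() {
    line="${2}_enable=\"YES\""
    # -x: match whole lines only, -F: fixed string, -q: no output.
    grep -q -x -F -- "$line" "$1" 2>/dev/null || printf '%s\n' "$line" >> "$1"
}

# Replace the first occurrence of OLD on each line with NEW,
# keeping the original as FILE.bak.
# OLD is still read by sed as a basic regular expression, so stick to
# plain letters and digits, and avoid "|" in OLD and NEW.
# Usage: replace_in_file FILE OLD NEW
replace_in_file() {
    cp -p "$1" "$1.bak" &&
    sed -e "s|$2|$3|" "$1.bak" > "$1"
}

# Examples (run as root):
# enable_service /etc/rc.conf apache22
# replace_in_file /etc/rc.conf apachy22 apache22
```

If a change turns out to be wrong, the .bak copy can simply be moved back into place.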

Updating
Updates can be good for a server, but they can also be harmful. Updates may offer bug fixes, new abilities, more efficient algorithms, and less resource usage. So, updates may help a server’s performance. However, an update may contain a bug that the developers did not find. This bug may be minor or it could cause the system to be down for a while. Generally, it is best to have a “testing server”. This server would be exactly like the main server, but the testing server has the latest updates. Server administrators would use such a server to test out new configuration settings on services and make sure that updates and new software work properly. However, some companies may not have the funds to have this testing system. It may also help to watch the Internet for reports on major bugs.
An alternative to a testing server is to have a virtual testing server. Install virtualization software on a computer/operating-system of choice. Then, install BSD and test the newest updates and such on this system. This will not be a perfect test because the hardware is not the same as the server. BSD distros run very well in virtual machines, so no problem should exist here. When updating the system, type:
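freebsd-update fetch && freebsd-update install

This will download the available binary patches for the FreeBSD base system and then install them; the install step only runs if the fetch succeeded. Third-party software installed from packages or ports is updated separately. You must be root to apply such updates.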

Log Space
Over time, logs will consume a lot of space in /var/log/. To reduce the disk usage, remove or archive old logs. This can be done in a number of ways. If your company requires the logs be stored, then get a USB hard-drive and move the logs to the drive.
Logs can also be rotated automatically. On FreeBSD this is the job of the newsyslog utility, configured in /etc/newsyslog.conf (many Linux systems use logrotate instead). Such a utility will compress old or large logs and give the system new, empty files to start writing more logs. To empty a single log, type a command like this:
echo "" > /var/log/SOME_LOG
This will empty the log and keep the file without making a copy or compressing the file. Before removing logs, check for any recent activity that should be noted.
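
When a log should be kept but is taking too much room, a compressed copy can be made before the file is emptied. This is a minimal manual sketch, not a replacement for the system's own log rotation; the function name archive_log and the timestamped archive name are assumptions for illustration.

```shell
#!/bin/sh
# Write a compressed copy of LOG next to it, then empty LOG in place.
# Prints the path of the archive.
# Usage: archive_log LOG
archive_log() {
    log=$1
    archive="$log.$(date '+%Y%m%d%H%M%S').gz"
    # The log is only emptied when the compressed copy was written.
    gzip -c "$log" > "$archive" && : > "$log" && echo "$archive"
}

# Example (run as root; SOME_LOG is a placeholder as above):
# archive_log /var/log/SOME_LOG
```

Emptying the file in place, instead of deleting it, lets a daemon that still has the log open keep writing to it.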

Security
Viruses and hackers may try to destroy the system from the outside, or people physically near the server may cause harm. The server must be protected physically and at the software and network level. The server and network system can be secured at various levels in numerous ways.
For physical security, it is best to keep the server room locked and (if funding is sufficient) set up security cameras. Large companies with very important servers may want to consider hiring security guards. There are many other ways to secure the server physically, but that is beyond the scope of this magazine.
If local computers communicate with the server(s) via Wi-Fi, the wireless network should use encryption and, if the router supports it, MAC address filtering.
If a script needs to be added to the /etc/rc* system, thoroughly review the script and only place executables here from trusted developers. The rc utility starts scripts at boot time, at shutdown, and during other important events. If a virus gets installed here, it may be difficult to remove it and it can cause a lot of damage.
To see who has logged in, use the last command. This command may produce a long list of entries, so it may be better to pipe the output into less.
last | less
The command above will also show when the system was shut down or rebooted. This can be helpful when figuring out when the system was last powered off. If the system loses power from the power supply, that will not be seen in this log.
The number one part of security that should not be neglected is anti-virus software. A popular open-source scanner is ClamAV. Beware though, anti-virus software can use up a lot of memory, and while scanning it can consume a portion of the CPU resources. Be sure to allow the virus scanner to scan the system and get definition updates after the company’s closing time or during maintenance time. A script can be made to update the definitions and scan the system. Before leaving for the day, execute a script with contents like those in Listing 1 (remember to use root privileges). Techs may want to read the man pages for ClamAV and add the parameters that will best suit the system’s needs (Listing 1).
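
An update-then-scan job along these lines could look like the sketch below. It is an illustration only and is not the article's original listing; it assumes ClamAV's freshclam and clamscan commands are installed, and the function name nightly_scan, the scanned directory, and the report path are placeholders.

```shell
#!/bin/sh
# Sketch of an update-and-scan job for ClamAV.
# The two command names can be overridden, for example for a dry run.
FRESHCLAM=${FRESHCLAM:-freshclam}
CLAMSCAN=${CLAMSCAN:-clamscan}

# Update the virus definitions, then scan TARGET_DIR and write the
# scanner's output to REPORT_FILE.
# Usage: nightly_scan TARGET_DIR REPORT_FILE
nightly_scan() {
    target=$1
    report=$2
    # Stop here if the definitions could not be updated.
    "$FRESHCLAM" || return 1
    # -r: scan directories recursively, -i: list only infected files.
    "$CLAMSCAN" -r -i "$target" > "$report" 2>&1
}

# Example (run as root; paths are placeholders):
# nightly_scan /home /root/clamscan-report.txt
```

The report file can then be reviewed the next morning and any findings noted in the maintenance log.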

Backups
Sometimes, the server may crash or lose data no matter how well it is maintained. To prevent permanent loss of data, the storage units should be backed up. Administrators will need extra storage devices. If the server has ten terabytes of data, then the backup storage should be the same amount or more. With the stability of USB devices and FreeBSD’s excellent support for such hardware, this makes backups easier. Plug in a USB external hard-drive. Use your preferred backup utility. Once finished, unmount and unplug the backup USB device and store it in a secure, dry, safe place. It may be best to store the drive in a fire-proof safe. Then, if the building or storage room catches on fire, the company will still have the data from the last backup. Clonezilla is a live Linux disc that can be used like Ghost to make an image of the hard-drive to an external hard-drive.

For some, it may be best to only back up /etc/, /home/, /root/, and any other folders that may store important files that are needed that cannot be recovered through a fresh install. Keep a list or a storage device with all of the software installed after the original/last fresh install. Also, keep a copy of the installation disc of the preferred BSD operating system. Then, if the system must be reinstalled, the tech can install the BSD distro, install the applications, and then put the data and files back. Remember, when doing a fresh install, to reformat the hard-drive(s). The filesystem may not have been formatted in a long time and the system crash may have been caused by, or caused, corruption of the filesystem. If the system is completely ruined, use the “rescue mode” that is on most BSD installation discs. This recovery utility may help save the system and data. When the server boots from the disc, read the menu and press the button needed to initiate rescue mode.
Always make a second separate backup of the company’s databases and data. If the company’s data gets ruined from a hacker or for whatever reason, this backup will be helpful. Saving the company’s data may be more important than the server itself, so keep this data safe and make back-ups often. If the system is fast enough, make a script that will copy the data to an external or remote storage device during a lunch break or some other large break.
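
A small script for copying selected directories to an external drive might look like the following sketch. It uses tar with gzip compression as one possible backup utility; the function name backup_dirs and the mount point /mnt/usb are assumptions made for this example.

```shell
#!/bin/sh
# Create a dated, compressed archive of the given directories.
# Prints the path of the archive.
# Usage: backup_dirs DEST_DIR DIR [DIR ...]
backup_dirs() {
    dest=$1
    shift
    archive="$dest/backup-$(date '+%Y%m%d').tar.gz"
    # -c: create, -z: compress with gzip, -f: write to this file.
    tar -czf "$archive" "$@" && echo "$archive"
}

# Example (run as root; /mnt/usb is a placeholder mount point):
# backup_dirs /mnt/usb /etc /home /root
```

Listing the archive with tar -tzf before unplugging the drive is a quick way to confirm that the backup contains what was expected.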

Important Rules
Here are some very important rules to follow when managing a server.

KISS
Keep It Simple, Stupid. Do not write overly complicated scripts or use long strings of commands when a shorter or simpler script/command can be used. Making scripts/commands more complicated than they really should be can increase the chances that a coding mistake will be overlooked and do something to ruin the BSD system, especially if the script is executed in root’s account. As another example, if a system needs to be updated only for the purpose of getting a new SCSI driver, then download and install that single driver (if possible) instead of updating the whole system. This rule also applies to deciding whether to install an application. If it is not needed, do not install it. Some installed programs can slow down the boot-up time.

NOTE: The above tip does not mean performing a task the lazy way. Perform tasks completely and correctly – just do not overdo it.

Do not use the name Root in vain – If a task can be completed on the server (or a workstation) using a regular account, do so. Only use Root when absolutely necessary, or else one accident can cause devastation to the server and network.

Know your BSD distros – Overall, BSD distributions are the same as far as file and application locations. However, some may be different or may not support some shell commands. Before running certain commands (like rm and mv), make sure the files and directories you plan to manipulate exist in the location you are normally familiar with. Performing an action in the wrong directory can cause confusion when the system does not behave as expected.

If it is not broken, do not fix it – If an upgrade is not needed, then do not do it. Updating, editing, and replacing hardware can cause issues. Doing so on a system that does not need it wastes resources, especially when the system later has issues from the unneeded “fix”. For illustration, assume Bill upgrades the BSD operating system without a specific need. Before, everything ran well, but now there is a conflict with the new system. The conflict could have been avoided if Bill had not upgraded the system. When a system genuinely needs a repair or update, the potential risks of upgrading and repairing are worth taking.

Certification Prep
For those of you wanting to get your BSD and/or LPI certifications, you must understand all of the server daemons and configuration files. You must thoroughly understand the great importance of system backups and how they are performed. Learning the location of important system and server logs will also help those wanting to earn such certifications. Obtaining these certifications will help admins get better jobs. Studying for such certifications will also give admins the knowledge they must have to sufficiently manage servers.

About the Author
Devyn Collier Johnson DevynCJohnson@Gmail.com has written many articles for Linux.org, wrote one article for the Full Circle Magazine on Clementine, and was the technical editor for McGraw Hill’s book Epub: From the Ground Up. The author has some experience and certifications in Linux/Unix systems.

This article was re-published with the permission of BSD Magazine. To learn more about iXsystems’ commitment to open source, check us out here: https://www.ixsystems.com/about-ix/
