FreeBSD Programming Primer – Part 1


January 29, 2013


BSD Magazine Article

In this new series we will look at the tools, processes and methods involved in writing software, including developing a Content Management System (CMS) which will run under an AMP stack on FreeBSD, OpenBSD, Linux etc.

What you will learn
How to configure a development environment and write HTML, CSS, PHP and SQL code
What you should know
BSD and general PC administration skills

Within the I.T. environment there are many disciplines, and often these skill sets work in isolation. The sys-admin doesn’t always understand the challenges faced by the programmer or developer, the support engineer doesn’t understand the problems of the developer, and the project manager doesn’t understand the problems of the technical staff. In this new series, we will examine from first principles how to develop a CMS that will run on any Apache / MySQL / PHP stack. This will involve writing HTML, CSS, PHP and SQL code.
Code is Everywhere
To the uninitiated, writing computer code from scratch may seem a challenge. Certainly, some programming languages are more complex than others, but the fact remains that you have almost certainly programmed some device at some stage without realizing it, even if you have never been near the command line (for example a VHS recorder, central heating timer etc.). In doing so you instructed the device to do something (record the Simpsons at 10:00PM on Friday evenings). Software is effectively just a collection of instructions, logic and actions like this that allow the computer to interact with another computer, an end user or just itself. The skill is in writing good code that meets the following guiding principles:
1. Does “what it says on the tin”
2. Is user friendly
3. Is secure and reliable under stress
4. Is fast and efficient (Don’t Repeat Yourself)
5. Is easily modified and extended
6. Can be easily understood
7. Has documentation
While some of these points are essential to any piece of software, some may be more important than others depending on the operating environment and specification. For instance, a piece of code that pulls pages from a website on a daily basis into a new directory in the format day_month_year (like 01_01_2013, 02_01_2013 etc.) for later reading by a technician would not necessarily require anything other than a log file entry saying “404 Not Found” if no content was available. However, if this was a critical program designed for an end user, it would be better practice to raise a friendly error message e.g. “The page you requested was not found. Please try again later or contact the helpdesk on 123 456789”.
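The two behaviours described above can be sketched in a few lines. This is only an illustration, written in Python for brevity rather than the PHP used later in the series. The function names archive_dir_name and report_missing_page are hypothetical, and the helpdesk number is the placeholder from the text.

```python
from datetime import date


def archive_dir_name(day: date) -> str:
    """Return a directory name in day_month_year form, e.g. 01_01_2013."""
    return day.strftime("%d_%m_%Y")


def report_missing_page(for_end_user: bool) -> str:
    """Choose the message to emit when no content was available."""
    if for_end_user:
        # End users get a friendly message that tells them what to do next.
        return ("The page you requested was not found. "
                "Please try again later or contact the helpdesk on 123 456789")
    # A technician reading the log file only needs the bare status.
    return "404 Not Found"
```

Calling archive_dir_name(date(2013, 1, 2)) gives 02_01_2013. The same failure produces two different messages depending on who will read it, which is the point of the example.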
Software writing should be creative and enjoyable, and part of the challenge is to have a reasonable idea of what you want to achieve beforehand, who your audience is, what limitations you must consider, and the environment the software will run under. A good functional specification should cover these details, but it is important to realize that software is never really finished. More functionality may be required, the environment may change, or bugs and faults may need to be rectified in the program. That is why code should be easily modified and understood, as it is the programmer’s worst nightmare having to maintain a badly written, undocumented, broken program. Trying to get inside someone else’s logic, especially when under pressure to meet deadlines, can be very stressful!
Computers Are Not Very Clever
The old adage “Garbage In = Garbage Out” is most applicable in the area of programming. As computers are not very clever (CANVC), they can only interpret literally any instructions that they receive. For instance, you might think you have asked the program to print the date, but due to an error in your logic, it might return 01-01-1970, NULL, or UNDEF. It might not even return anything at all. Sometimes when writing code you will be convinced the computer is your enemy. This is where defensive programming and debugging come to the fore, by re-thinking the obvious (and not so obvious) assumptions such as “All input data is valid”. The defensive programmer would respond by saying “All data is important and tainted unless proved otherwise”. Expect the unexpected. Sometimes it is best to walk away, take a break and return to the problem later. Late night coding sessions can be frustrating, especially if the result is not what is expected. Trying to debug an issue without a decent IDE (Integrated Development Environment) is possible, but time consuming.
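As a minimal sketch of the defensive mindset, the function below treats its input as tainted until it has been proved to be a real date. It is written in Python for brevity rather than the PHP used later in the series. The name parse_day and the day-month-year input format are assumptions made for this illustration.

```python
from datetime import date, datetime
from typing import Optional


def parse_day(text: object) -> Optional[date]:
    """Return a date for input like '29-01-2013', or None if it is not valid."""
    # Treat every input as tainted: check the type before anything else.
    if not isinstance(text, str):
        return None
    try:
        return datetime.strptime(text.strip(), "%d-%m-%Y").date()
    except ValueError:
        # Wrong format, empty string, or an impossible date such as 31-02-2013.
        return None
```

The caller has to deal with None explicitly, so bad input is caught at the door instead of surfacing later as a mysterious 01-01-1970.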
Choosing the Language
Not all programming languages are equal, and some are less equal than others. Different languages are geared towards different tasks.
Shell programming languages (for example Bash, Sh etc.) are great for system administration tasks e.g. clearing out and archiving directories, running commands depending on the user response etc. However they are not fully fledged programming languages as such.
BASIC and Pascal are great for learning how to code, but they have some limitations. While it would be possible to write a CMS in either of them, as they are not primarily geared towards the web the program would be complex and convoluted.
The same argument applies to C. C is extremely powerful and flexible, and PHP, Apache and MySQL are written using it. It would be complete overkill to write the CMS from scratch in C as we would effectively have to re-invent the wheel.
Java would make a great platform for a CMS due in part to its extensive library support and security, but as it is object orientated rather than procedural, the code and underlying principles would be more complex.
Script based languages (for example Ruby, Perl, Python, PHP) are geared towards the Internet, and most ISPs will support them. As PHP has good support, is very portable, has excellent documentation, and integrates well with both Apache (our web server software) and MySQL (our database), it is a strong choice. While the other script languages are just as suitable for our CMS, the author has more experience with PHP so that is the reason for the choice.
SQL, HTML and CSS are different types of language. While not considered “real” programming languages as such (on their own you could not write a software application) they are essential to our CMS.
SQL (Structured Query Language) is the de facto standard language of databases. While most databases today use some form of SQL to extract, view and alter data, the “dialect” differs from database to database. We will use SQL to fetch our dynamic content from our database.
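To give a flavour of fetching content with SQL, here is a minimal sketch. It uses Python and the SQLite module from its standard library as a stand-in for the PHP and MySQL combination used in the series, so the dialect may differ in detail. The pages table, its columns and the function names are hypothetical.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding a single demo page."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
    )
    conn.execute(
        "INSERT INTO pages (title, body) VALUES (?, ?)",
        ("Home", "Welcome to our CMS"),
    )
    conn.commit()
    return conn


def fetch_page_body(conn: sqlite3.Connection, title: str):
    """Return the body of the page with the given title, or None."""
    # The ? placeholder keeps user-supplied text out of the SQL statement.
    row = conn.execute(
        "SELECT body FROM pages WHERE title = ?", (title,)
    ).fetchone()
    return row[0] if row else None
```

With conn = make_demo_db(), the call fetch_page_body(conn, "Home") returns the stored text, while an unknown title returns None. Passing the title as a parameter rather than pasting it into the query string is a deliberate choice in keeping with treating all input as tainted.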
HTML (HyperText Markup Language) is the language of the web page. Each document has separate elements e.g. a body, header, images etc. and the HTML standard defines what these elements are. HTML pages are served by Apache and interpreted by the client browser e.g. Firefox.
CSS (Cascading Style Sheets) are used in conjunction with HTML to change the style of the raw HTML pages. While it would be possible to write a CMS without it, the result would probably not be very pleasing to the eye.
JavaScript is a lightweight programming language used for dynamic tasks in conjunction with HTML e.g. changing content on the fly. It is run seamlessly from the client browser.
Generally, programming languages fall into two categories, compiled and interpreted. For instance C, Pascal and many dialects of BASIC are compiled, whereas most script languages are interpreted. The major difference between compiled and interpreted languages is how the program itself is accessed and run. In the compiled scenario, the initial source code is passed through a compiler which generates a stand-alone binary if the source code is valid. The operating system then handles the corresponding output. A binary compiled for one particular operating system will not run on another; in general the compiler has to match the O/S unless some form of emulation and library support is available. With interpreted languages each line of the source code is passed through the interpreter which handles the corresponding output. Both language types support additional libraries which extend the core functionality of the language (e.g. graph support) and these are used as required. See Figure 1 and Figure 2 – Compiled and Interpreted languages.
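The interpreted model can be illustrated with a toy interpreter that reads its source one line at a time. This is a sketch in Python, and the two-command language (SET and PRINT) is invented purely for the illustration. Real interpreters are far more elaborate, and many translate the source into an intermediate form before running it.

```python
def run(source: str) -> list:
    """Interpret a toy language one line at a time and return its output."""
    variables = {}
    output = []
    for number, line in enumerate(source.splitlines(), start=1):
        words = line.split()
        if not words:
            continue  # skip blank lines
        command = words[0].upper()
        if command == "SET" and len(words) == 3:
            # SET name value: store an integer in a variable
            variables[words[1]] = int(words[2])
        elif command == "PRINT" and len(words) == 2:
            # PRINT name: emit the current value of the variable
            output.append(str(variables[words[1]]))
        else:
            # The fault is only discovered when this line is reached.
            raise SyntaxError(f"line {number}: cannot interpret {line!r}")
    return output
```

A program consisting of the lines SET x 5 and PRINT x produces the single output 5. A faulty line further down is not noticed until the interpreter gets to it, whereas a compiler would reject the whole source before producing a binary.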
The bottom line is that you need to choose your language for the task you have in hand. Some all purpose languages are great but you need to remember the limitations. The author often uses PHP for ad hoc scripts, but Perl or Bash would be just as effective. Often it is a case of what you feel most comfortable with, but at the same time you don’t want to fall into the trap “When the only tool you have is a hammer every problem is a nail”.
To err is Human
Writing code is paradoxically both infinitely creative and flexible yet structured and pedantic. One missed semicolon, a full stop in the wrong place, even word case can be the difference between a working code segment and an esoteric error message. Sometimes by fixing one problem other problems are introduced, sometimes the real problem was never addressed at all. It is important that we are able to snapshot and document our changes as well as quickly isolate any problems. As part of the series we will look at version control and debugging.
The Draft Specification
The initial specification of our CMS is as per Table 1. Further additions may be made over the series to demonstrate specific principles. The inspiration for parts of the specification came from the excellent CMS Drupal, by Dries Buytaert.
Testing
It is critical that any application is properly tested before release. While automated testing methods are available, for the purpose of this series we will limit testing to some crude load and security testing and ensuring that the program “just works as advertised”.
The Development Environment
In a commercial environment, the bare minimum would probably consist of a test (development) server, a live (production) server, a Version Control Server (VCS), possibly a database server (MySQL) and the developer’s workstation with an Integrated Development Environment (IDE) for code development, syntax checking and debugging. Source code would be pulled from the VCS, edited and tested on either the workstation or the development server, committed to the VCS and pushed to the production server for access by the users when stable and ready for release. This scenario is too complex for our series, but while it is possible to develop just from the command line, debugging (and certainly testing) will be close to impossible outside of a graphical environment. As a very bare minimum, you will need a headless FreeBSD box (without any GUI) and some sort of workstation with Firefox installed, but ideally your BSD development box should support Firefox, NetBeans, Apache, PHP, Git and MySQL. Your favorite CLI editor can of course still be used for editing.
In the Next Article
We will start programming in earnest and start serving our first CMS page.
Bio
Rob Somerville has been passionate about technology since his early teens. A keen advocate of open systems since the mid eighties, he has worked in many corporate sectors including finance, automotive, airlines, government and media in a variety of roles including technical support, system administrator, developer, systems integrator and IT manager. He has moved on from CP/M and nixie tubes but keeps a soldering iron handy just in case.
