Linus Torvalds Created the Linux kernel while at Helsinki University (Finland) Released September 16, 1991 | Linux Basics | Ian Murdock Created Debian while at Purdue University (Indiana) Released August 16, 1993 |
While many may shy away from Linux because of its complexity, it is this very complexity that makes it so interesting and beneficial. And as with anything complex, when taken as a series of small, simpler pieces (as we do on our guide pages) it becomes easy. With all of its pieces, Linux is like a bottomless toy chest that will provide you with many years of learning. "Never stop learning" as they say. Your brain needs exercise just as much as your body. Keep it in shape or you run the risk of becoming a mental turnip by the time you're 70. Linux is a great brain exerciser.
Back before Microsoft developed Windows, Macintosh computers were more popular. It was easier for new computer users to use a mouse to point to cute little pictures than to have to learn a bunch of DOS commands. However, you couldn't write batch files on Macs, couldn't redirect text or file contents to ports, pipe input to commands, take actions based on return codes, etc., etc. While the Mac GUI made it easier to use a computer, it insulated you from the hardware and OS kernel limiting your ability to execute commands and automate tasks. And isn't automation, i.e. having the computer do the work for you, what computers were supposed to be all about? The Mac GUI did quite the opposite. It required user input to accomplish anything. A similar comparison can now be made between Windows and Linux/UNIX servers and the same equations hold true:
Simplicity = Limitations
Complexity = Capabilities This is the case with just about anything. An audio system with a single "tone" control is easy to use but it doesn't give you the options for sound quality that one with a 15-band graphic equalizer does. While it may be easier to learn how to use a Windows server OS, you pay for it by being forced to manually supply inputs and by being restricted in your ability to automate. The real downside of this is that automation (having computers perform tasks instead of people) is what provides the greatest productivity gains, and gains in productivity can lower costs and increase an organization's competitive advantage.
While it may seem unbelievable that having an old Pentium system and $20 means you can have your own Linux Internet, LAN, gateway, or application server, our guide pages will show you how easily it can be done. The $20 is for a 5-DVD set containing the official Debian Linux distribution which is sold by Web vendors (now including
us!). The $20 is just to cover media, duplication, and labeling costs. The Debian Linux software itself is absolutely free and you can set up as many systems as you want with no licensing concerns once you get the DVDs. You can download DVD or CD images directly from one of Debian's mirrors and burn your own. However, considering that you'd be downloading over 20 giga
bytes of data, even with full use of a 1.5 mega
bit/sec T1 line it would take a
long time. When you consider the cost of the blanks and the time it would take to get an uncorrupted download of, and then burn, all the images you'd have to have a lot of free time on your hands to make downloading worthwhile. Given all the different types of servers you can set up (see the bullet list in the next section), a $20 set of DVDs is a bargain investment in your education.
Linux will run on many different hardware platforms and Debian supports the widest variety with each platform having it's own disc set. That's why you'll see Debian DVD and CD sets offered with notations like 'i386' for Intel PCs, 'PPC' (PowerPC) for older Macs, 'Sparc' for Sun systems, and even an 's390' set for IBM mainframes. There is also a 'Source' disc set which contains the source code files for the entire OS and all of the applications and utilities that come with it. This would be of interest to you if you are a C programmer (or want to learn C programming) and are interested in viewing or modifying the source code of the OS, utilities, and/or applications included with Debian. Note that if you want to install Debian on a standard Intel/AMD/Cyrix PC you'll want the 'i386' disc set.
When you say the word "server" most people think you're talking about powerful, expensive systems with RAID drives and dual processors. Nothing could be further from the truth. Any old PC can be a server. It's actually the software you run on it that determines if a PC is a server or a "workstation". And thanks to the modest hardware requirements of Linux, you don't need much of a PC in order to set up a server. Old PCs can be given new life as Linux servers. Every production Linux server we have set up have all been on P-III systems and they are all running great. And if you do have older server-class hardware you'll be very impressed with Debian's performance on those systems that you thought could be of no use to anyone. Whether old Dell, HP, IBM, or Compaq servers, I have yet to encounter a system where Debian did not accurately detect the RAID controller and other hardware.
Why OLD Is GOOD PII and older PIII systems make great systems for learning Linux for the following reasons: - Debian's performance as a server (read "No GUI") is suprisingly good on old hardware.
- It's the "green" thing to do as it will keep these old systems out of a landfill.
- They cost next to nothing. I created the Lenny installation page while installing it on a PII-233 with 128 megs of RAM and a 1.6 Gig hard-drive, a system that may run you about $10 if you can find one (ask around and you may be able to get one for free).
- The main reason - they're SLOW. Slow is good because it gives you time to see screen messages. Boot up Debian on a P4 2.something Ghz system and all of the screen messages will be a blur. Reading boot-up screen messages is not only educational, because it tells you what the OS is doing, but also alerts you to any potential problems. Slower scrolling of screen text is also helpful when executing some console applications.
|
The best way to play around with a Linux server is to pick up an old Intel P-II or P-III system without a monitor and keyboard from a swap meet or sites like eBay or Craigs List. Then also get a 2-port KVM (Keyboard/Video/Mouse) switch so you can use the monitor and keyboard from you current PC for both systems. Setting up Linux on a separate system doesn't cost much and it's a lot safer because you won't hose up your main Windows system trying a multi-partition dual-boot scenario where both Windows and Linux are installed on the same system. (Besides, you'll want to use your Windows system to access your Debian server to test it out which you can't do with a dual-boot configuration. We show you how to network two systems using two NICs and a single crossover cable on the
Networking page.) Getting an older Intel system is helpful as well as inexpensive because you're less likely to run into issues with chipset support and drivers that you can encounter with newer hardware. You can pick up a used P-III system (without monitor and keyboard) on eBay for well under $50 these days. (I picked up a Dell Optiplex GX1 P-III with 128 meg of RAM and an 8-gig hard-drive for $29 on eBay.) Then pick up a
$40 2-port PS/2 KVM including cables on Amazon. Just make sure the older system has a network interface (NIC), and CD or DVD-ROM drive (there's no need for a sound card on a server). A 4-gig hard-drive and 128 meg of RAM is plenty (setting up a Debian server with all of the software we install on these guide pages takes up less than a gig of hard-drive space). A clone system would be better than a name-brand one but those are not as easy to come by. If your systems don't have a NIC (Network Interface Card) you can pick up 3Com 3C905 (PCI) NICs on ebay for under $10. A DVD-ROM drive is preferred over a CD drive since more and more software is coming on DVDs these days. There's no need to get a DVD burner. I got a Sony DVD-ROM drive on eBay for $15. That, along with my $29 Optiplex, put my total server investment at $44.
Linux is the name operating system*. However, unlike Windows it is available from many different companies. These companies may add their own bells and whistles to the operating system (like a graphical install routine), but they all use a version of the Linux "kernel" (i.e. guts of the OS). Linux releases from different companies are called "distributions" (aka "distros"). The Red Hat distribution is the most popular commercial distro with Suse and Mandrake being two others. Commercial distros are produced by companies which seek to make a profit on selling and supporting their distributions of Linux. (If see a distro simply referred to as "Linux", for example "Linux 9", it's Red Hat.) Debian (pronounced deb-ee-en) is a little different. It's the world's leading non-commercial distribution produced by volunteer developers world-wide seeking to promote the concept of free and open software upon which Linux was initially created.
* Technically, while the term "Linux" is commonly used to refer to the operating system, Linux is actually just the kernel piece. The rest of the operating system (command-line utilities, etc.) is typically from the GNU free software project. That's why Debian is officially referred to as "Debian GNU/Linux".
Commerical distributions become proprietary when they replace some of the commonly-used GNU pieces of the operating system with their own. It's at this point where you start to get distribution-specific problems and requirements for upgrades/support.
Red Hat is notorious for replacing many of the standard GNU UNIX-like commands with non-standard, proprietary commands of their own. As a result, many of the freely-available general Linux books and resources on the Web cannot be used when working with a Red Hat system. Even books and resources that cover earlier versions of Red Hat are difficult to use because more commands are changed with each new version. In a few years Red Hat Linux may not look like "Linux" at all.
Distributions will also differ in the /locations and names of configuration files. For example, the files that contain network interface (NIC) configuration information are as follows:
Debian | - | /etc/network/interfaces |
Red Hat | - | /etc/sysconfig/network-scripts/ifcfg-eth0 (A separate file for each interface) |
Suse | - | For versions >= 8.0 /etc/sysconfig/network/ifcfg-eth0 (A separate file for each interface) For versions < 8.0 /etc/rc.config |
The Debian distribution was created in 1993 by Ian Murdock while a computer science student at Purdue University. He wanted a Linux distribution that was maintained in a free and open manner adhering to the original intent of Linux and GNU software. (The Debian name comes from combining his name with that of his now-wife Debra.) In addition to developing the initial software, he wrote the Debian Manifesto which outlined his vision for a free and open Linux distribution.
Debian's GNU/Linux pedigree and adherence to standards makes it the distro of choice for many including being chosen for a Space Shuttle mission back in 1997. The current Debian distribution (version 5.0, code-name 'Lenny') includes over 25,000 software packages which are also totally free (which is why it comes on 31 CDs or 5 DVDs). Desktop applications, server applications, utilities, developer tools, and more can be added to your system with a single command.
Schools and Libraries Libraries and educational institutions at all levels with computer labs, and even some businesses, can benefit from the work done by the K-12 Linux Terminal Server Project. Older PCs with no hard-drives can be used as "X-Terminals" in the lab while desktop and applications (such as Open Office and Mozilla Web Browser) management are handled at a central X server. Businesses with more comprehensive applications requirements can use Debian's Linux Terminal Server Project package to set up a terminal server and you can have an office full of disk-less workstations for a $20 software investment. Because the end-user systems don't have hard-drives:
- There are fewer hardware failures to deal with
- The workstations consume less power running cooler and quieter
- Users can't modify or hack the workstation configurations or software reducing support requirements
In addition, SkoleLinux is a Linux distribution developed specifically for schools by the Debian Edu project. It is being adopted state-wide in the German federal state of Rhineland-Palatinate. |
As a non-commercial distribution, Debian doesn't have to crank out new versions to generate revenues which is why the current version number is much lower than for other distros. Some occasionally criticize Debian for this but you can bet a years salary they're not network administrators (likely those who use Debian for their desktop OS). Network admins don't like upgrading or replacing servers (which is evident by the fact that Microsoft had to back off their initial plans to stop supporting NT because so many Windows servers out there are still running it). "If it works, don't fix it." More than anything, network admins want stable, reliable servers that simply sit there and do their job year after year requiring little, if any, attention. If you don't like babysitting servers you'll love Debian. Its reputation as a rock solid OS is due, in part, because they're not rushing to crank out new versions.
Microsoft sees Linux as the single biggest threat to its business for one simple reason. Since no one owns Linux, it's not something they can just buy up in order to destroy (a tactic Microsoft commonly employs to get rid of its competition as was revealed in the DOJ anti-trust hearings). Rather than deal honestly with genuine competition they choose to bash Linux. Ironically, the fact that Microsoft feels it's necessary to take the low road against Linux only helps to substantiate it as a serious operating system capable of providing stable, scalable, secure servers for any size enterprise.
Linux is becoming mainstream in its use as a server operating system. According to the Gartner Group, major server vendors (HP, IBM, Dell, Sun) reported that while overall commercial server sales for all platforms dropped 8% from 2001 to 2002, their Linux server sales increased by 63%. Networking stalwart Novell bought Suse to get into the Linux desktop arena. In addition, Novell is developing editions of its Groupwise and Zenworks products for Linux desktops and IBM already has a Linux version of it's Notes/Domino corporate e-mail package.
Another reason you'll want to learn Linux is because of the rapidly growing popularity of virtualization (creating multiple virtual servers on a single physical system). The giant in the virtualization arena is the ESX software package from VMware. You install the ESX software on a system the same way you would install any Linux distribution on a system. Then you use ESX utilities to create virtual machines and install a "guest" OS (Linux, UNIX, Windows) onto those virutal systems. ESX is based on Linux so the more you know about Linux the better you'll be able to work with ESX (ask my co-workers who were envious of my Linux skills when we started using ESX). There are a lot of ESX tools that are only available at the command line.
With heavyweights like HP, Novell, IBM, and Dell all behind the growth in Linux, its popularity will only increase and those who support server and desktop systems would be well-advised to learn it. In a November, 2004 article ComputerWorld said that "Linux use is growing faster than the talent pool needed to support it." and that "Skilled Linux administrators in major metropolitan markets command 20% to 30% salary premiums over their Unix and Windows counterparts."
Using Linux | |
Linux can be used to set up any number of server-type systems as well as workstations. This site is primarily concerned with the server aspects of Linux. You can use your Debian Linux software to set up the following types of systems:
- Web servers for external (Internet) or internal (Intranet) use. (We show you how on the Internet Servers page.)
- Mail servers to handle both internal and Internet e-mail. (We show you how on the Internet Servers page.)
- Other Internet-type application servers such as FTP, news, IRC (chat), etc.
- Web cam servers to keep an eye on your home or business operations from a remote location. (We show you how on the Web Cam Server page.)
- Proxy/NAT servers that allow all the systems on a network to share a single broadband Internet connection at home or the office. (We show you how on the Proxy/NAT page.)
- Packet-filtering firewalls which allow you to control what traffic goes out of and comes in to your network (while also performing the proxy/NAT function). (We show you how on the Firewall page.)
- Internal LAN servers for file and print sharing much like Novell or NT/2000. There's even a Linux software package available called Samba that makes a Linux server appear as an NT server to Windows workstations. (We show you how on the LAN Servers page.)
- DNS servers to resolve Internet and/or internal LAN host/domain names. (We show you how on the DNS page.)
- Database servers running MaxDB - formerly SAPDB (free), MySQL (free), or Oracle ($$$$) database software. (We show you how on the Database Server page.)
- Fax servers running HylaFax and utilizing old fax-modems allow all users on your network to send faxes from their desktops rather than printing out a hard-copy to stuff in a fax machine. (We show you how on the Fax Server page.)
- LAN and WAN routers which offer an inexpensive alternative to those $5,000 Cisco boxes.
- Syslog servers which allow you to centralize the monitoring of your network and systems operations. (We show you how on the Syslog Server page.)
- IDS (Intrusion Detection Systems) to monitor your Internet address space for hacking and attack activity. (We show you how on the Snort page.)
Given the free nature of the Linux software and its modest hardware requirements, small and non-profit businesses, schools, libraries, etc. can have all of the computing capabilities and Internet services of big, for-profit corporations with very little financial investment. And Linux is not just for the little guy. Big businesses can save big dollars with Linux because they don't have to pay for all those expensive client access or "seat" licenses (see the server comparison diagram below).
The other benefit to the modest hardware requirements of Linux is that if you do have a fairly powerful machine, you can run numerous applications (such as Web and e-mail and FTP and Telnet and DNS) all on one system reducing your overall hardware requirements. (While it is certainly possible for a single server to handle both internal LAN and external Internet functions, it isn't wise to put both functions on one server for security reasons.)
Support options for Linux-based systems are also growing. Commercial server vendors HP, IBM, and Dell now offer servers pre-loaded with Linux and provide numerous support options for them. Commercial distro vendors have various support packages available and third-party companies offer distribution-specific support options ranging from per-incident to 24/7 contract coverage. For individuals and small businesses, there are free self-help and peer-support options such as on-line documentation, newsgroups, listserves, and chat rooms. We show you how to use one of Debian's chat rooms on the Compiling Software page and Debian support resources are listed on the Resources page.
If you're looking for a career, there are two different categories of jobs working with Linux/UNIX servers, but they can often overlap. You can focus on a career as a network administrator, where you primarily take care of all of the types of systems mentioned above, manage user accounts, access rights to files, etc. The other is as a programmer, where you are writing shell scripts or programs which can be written in a wide variety of languages, with C being the most widely used. These scripts and programs are often used in the middle or "back-end" tiers of "multi-tier" client/server systems to automate things. For instance, Linux/UNIX servers are often used as back-end database servers running Oracle. In large organizations these two aspects are usually segregated with different job titles. In smaller organizations you may end up doing both, which would be the best training you could ask for. Note that a network administrator will find their life much easier if they are a good shell script programmer. The better they are at writing shell scripts the more they can automate administrative tasks on the servers. As more and more businesses learn about the potential for productivity gains and substantial cost savings realized through the reduced licensing costs associated with Linux, those with Linux knowledge will be in greater demand.
That's not to say you have to be into networking or C programming to have any use for Linux. A vanilla installation of most Linux distributions will include the installation and setup of the Apache Web server software. Out of the box a Linux system can act as a test Web server for Web site developers and those who write CGI scripts for Web sites (which you know the value of if you've ever taken down a production Web server hosting 200+ sites with a looping CGI script).
Linux can be useful at home too. It's easy to use it to set up a firewalling proxy server to share a broadband Internet connection with the all of the computers on a home network. (We show you how on the Networking page.) And as long as you've got a Linux proxy box hanging on the Internet, it's just as easy to have your own home Web server (we show you how on the Internet Servers page).
Normally, if you want to set up a e-mail or Web server you have to have a fixed ("static") IP address assigned by your ISP and your own domain name. However, dyndns.org offers a free service called "dynamic DNS" which will allow you to set up your own home Web and e-mail server on a system where the IP address changes (as happens with dial-up, and residential DSL and cable-modem services). You don't even need your own domain name! If you did register your family's name as a domain name you can use dynamic DNS and set up a Sendmail server to receive e-mail for the domain name (ex: homer@simpson.com). Family members would then set their POP3 clients to retreive their mail from this Sendmail server rather than the ISP's. In addition, you can run the Apache Web server software on the system also and host your own family Web site. Information on using dynamic DNS services is given on the DNS page and setting up a Web/e-mail server using the Apache and Sendmail software is given on the Internet Servers page.
Kinda Like DOS | |
Linux is an OS with a character-based interface like DOS. DOS has a character-based interface and it is the command interpreter in the COMMAND.COM file. When you open a DOS window in Windows you are running a character-based command interpreter similar to DOS' COMMAND.COM interpreter (the CMD.EXE file). It is this interpreter that gives you the C:\> prompt when you open a DOS window or boot a DOS system. (Now you see why they call it an "interpreter". It interprets the commands you type in at the prompt.)
While DOS only has one character-based interface, Linux (and UNIX) have several that you can choose from. Instead of "interpreters" they are called "shells" (but they are still interpreters). UNIX has three standard shells; C, Korn, and Bourne.
Linux has it's own versions of these three popular UNIX shells plus a few of it's own. One is called "Bash", for Bourne-Again Shell, and it is the default shell for most Linux distributions because it combines most of the features of the Bourne and Korn shells.
The Linux/UNIX shells have their own prompts. When you log into a Linux system you'll see either % or $ depending on which shell you choose to use. There's also a third prompt which is the # if you log in as "root". "root" is the super-user account in Linux/UNIX, similar to "administrator" with Windows or "supervisor" with Novell.
Just as you would enter commands like dir and copy at a DOS prompt, you enter commands like ls and cp at a Linux shell prompt.
And just as Windows 3.1 provided a GUI interface to DOS-based systems, Linux also has several GUI interfaces available. The most widely-used GUI is Gnome. KDE is another popular GUI. But since it doesn't make a lot of sense to have two different GUIs on one system, you usually just install one or the other. When you go looking on the Internet for Linux software you'll often see programs with names that start with a G or a K (like Gpad) which indicates that they are programs that will only work with those specific GUIs.
You will also often see GUI program names start with an X or referred to as "X11", "X windows", or just "X" programs. That's because the GUI on Linux/UNIX is a little more sophisticated. A piece of software called an "X-server" actually generates the graphics, and a different piece, called a "desktop manager" (like Gnome or KDE) manages the display of the graphics. This is done so that a central server can generate the graphics while individual workstations can display them the way they want by customizing their desktop manager settings. (Linux/UNIX was into "thin clients" long before it became fashionable in the Windows world.) On a single Linux PC with a GUI installed, the X-server piece and the desktop piece just run on the same machine. (Programs that are not written for a GUI, i.e. are written for the character-based shell interface, are referred to as "console" programs.)
Drawing on the Windows comparisons a little more, you may be familiar with Windows 2000. There are two versions of Windows 2000, Server and Professional (Workstation). With Linux there is only one version, and a Linux system can be either a server, or a workstation, or both simultaneously. You decide if the system is a server or a workstation simply by the services and applications you run on it. The routine on the Installation page will install both server and workstation applications. By following this installation routine, you'll end up with a Linux system similar to one in the following diagram. (There's now a free and open version of Sun's Star Office product called Open Office and some kind folks have created a Debian package of it. See the Resources page for a link to them.)
When compared to a common Windows PC the main difference is that the GUI is integrated into the operating system with versions of Windows after 3.x. As you can see, conceptually they are the same. It's just that the software (both OS and applications) that is run on the system are different, and with Linux the GUI is run like an optional application (it's not forced on you by the OS). Be aware that the items listed in the "Application" layer are OS-specific. That is, you can't run Windows applications on a Linux system and you can't run Linux applications on a Windows system. Some larger "name brand" applications are available in different, platform-specific versions. For example, the Adobe Acrobat Reader has versions available for Windows, Linux, and Macintosh.
The real differences between Linux and Windows can be seen in the server area. While Windows 2000 Server would crawl on a Pentium-II with 64 meg of RAM, this same hardware would make a respectable Linux server. The biggest difference, however, is in the software and licensing costs. While the Windows server software does include the IIS Web server software, the server software will cost you $1,000. And that's only for anonymous access to the Web pages hosted on the server. If you plan to have any Web pages that people log in to, you'll need to get an "Internet Connector License" for an additional $2,000. The Exchange e-mail server software only costs $680 but it'll be a mail server no one can access. For that you need CALs (Client Access Licenses). You not only need Exchange CALs (around $80 per user) so people can use the Exchange application, but because the Exchange application is hosted on a Windows server you'll need Windows Server CALs (which are around another $30 per user) so they can access the application. Here is how the costs compare for 100 users with a combination Web/Mail server:
And this doesn't even get into the annual costs associated with Microsoft's "Software Assurance" program. These costs are just to get things set up initially. The above prices were taken from the CDW Web site (www.cdw.com) for the Windows 2000 and Exchange 2000 products. If your organization has close to 500 users the additional Exchange and Server CALs raise the cost to $58,680. You can verify the need for the above connector and CALs by calling Microsoft at 1-800-RU-LEGIT and select the options for pre-sales licensing. And that's just for a Web/E-mail server. Setting up database server using SQL Server also involves application CALs so the cost difference between Linux and Windows for a mid-sized organization would be well into six figures for two servers.
If you're planning on hiring a consulting firm for a new system implementation, ask them if they offer Linux and UNIX solutions. If they don't, you're only going to get Microsoft products suggested to you, which may be better for the consulting firm because they get a piece of the action, but you'll get anything but the most cost-effective solution for your needs. If only Microsoft solutions are proposed ask them why, given the potential cost savings for you - their client - they didn't offer any Linux or UNIX solutions, particularly in the Internet server area. Be suspicious if they infer that Linux isn't mature or stable. Linux servers are sold by the likes of IBM, HP, Dell and others who sell to Fortune 500 customers. While there may very well be areas where a Windows solution is appropriate, such as application requirements or the necessity to interface with a legacy Windows system, any firm that bases their proposals on a "one platform fits all" attitude doesn't have your best interests at heart.
Web and E-mail servers are perhaps the easiest place to save big money by going with Linux. While many desktops have the Microsoft Office Suite installed, Outlook and Outlook Express don't care if they're pulling messages from an IMAP server as with Exchange on Windows or a POP server as with Sendmail on Linux. And the release of Samba 3.0 (see the LAN Servers page) can be used to add a Linux e-mail server to W2K's Active Directory to handle user authentication. As far as Web servers go, IE doesn't care if it's pulling pages off of a Linux Apache or Windows IIS server and Chilisoft (see the Internet Servers page) will allow you to run ASP pages on Apache.
An area where organizations could save substantial dollars using Linux is with database servers because database (Oracle and MS SQL Server) user or seat licenses are typically the most expensive. In most instances, an ODBC connector sits between the database server and the applications running on the client workstations. The beauty of replacing a database server with one running Linux and a free database product is that you simply use a different ODBC connector (the myODBC connector in the case of MySQL) on the clients. If a database server is acting as a back-end to a front-end browser-based application, simply change the ODBC connector on the Web server. No client changes are needed at all. With SAP AG releasing their back-end database product as open source (MaxDB) and MySQL gaining enterprise-level features with each new release, there's simply no reason to bear the cost of an Oracle or SQL Server back end server. We'll show you how to set up a MySQL database server and ODBC connector on the Database Server page. In addition, both MaxDB and MySQL have commercial support options available. (Links to the MaxDB and MySQL sites, as well as several good reference sites, are given on the Resources page.) The advantage MySQL has at this point is that it's included with the official Debian distributions as a .deb package so the installation is more convenient. As more and more businesses realize the potential cost savings of using the open source database products, those with experience with them will have an advantage.
We've seen comparisons between Linux and DOS and Linux and Windows, but Linux is very similar to UNIX. If your goal is to be a UNIX administrator, learning Linux will get you 90% of the way there. That's not an exaggeration. I took a UNIX class at our local community college (which used an IBM server running AIX - IBM's flavor of UNIX) and I didn't encounter anything in my assignments that I couldn't do on my Linux system. Linux even comes with a Korn shell, which was the shell we used in my UNIX class because of its enhanced scripting features. I simply set my Linux system to run the Korn shell by default and this allowed me to have the same "user interface" at home that I had on the UNIX system at school.
When playing around with the x86 (PC-based) version of Solaris (Sun Microsystem's flavor of UNIX) I purchased a book called "A Practical Guide To Solaris". 70% of the book covers commands that can be entered at a shell prompt on a Linux system! (If you're interested in using the x86 version of Solaris to learn that UNIX operating system, see our Trying Sun Solaris for x86 page.)
If you're a nerd at heart, believe me when I say you will LOVE Linux. It has so many features that it boggles the mind. It's an OS that you could play around with for five years and you'd still find new capabilities and functionality. To say it's like DOS on steroids would be an understatement. Its strong suit is the ability to automate operations due to its myriad of functions and strong scripting capabilities. Sign up for a beginner's Linux or UNIX class at your local community college and you'll see what I'm talking about. Even in a basic class you'll learn about a lot of the neat things it can do. The one down-side to Linux/UNIX is that it's not a user-friendly OS so there is a learning curve involved. Using one of Linux's GUI interfaces is helpful in this respect but to really learn this OS you'll want to use one of the character-based shells. Don't be discouraged if you find Linux confusing. Due to its myriad of commands and capabilities that's not uncommon. I found taking a UNIX basics class at my local community college to be VERY helpful. However, I also found it very helpful to do a little playing around and reading up on Linux before starting the UNIX class as it allowed me to better understand and appreciate what was being taught. If you've done any Perl programming for CGI scripts on a Website you will find that knowledge helpful also. Not only because you can use Perl to write shell scripts, but because the syntaxes of Perl statements are similar to Linux/UNIX OS commands.
Avoid the GUI !!! In order to truly learn Linux you have to learn to use its commands at a shell prompt and work with text configuration files. Many things are not available in a GUI, and the power of automation that Linux offers can only be fully utilized with shell scripts which, as mentioned above, are merely text files containing a series of commands. The GUI makes sense for things like Web browsing, but even in a GUI you should have a terminal window open so you can enter shell commands.
Another reason to avoid GUIs is that they eat up system resources. Running a GUI can use up to 32 meg of memory. If you're running multiple server applications on a system with limited RAM, firing up a GUI can slow these applications considerably.
There are also security considerations when running a GUI. A GUI should never be installed on a server. The X-server part of a GUI setup is, after all, a server. As such, it opens ports and uses them to "listen" for remote connections. Unnecessary open ports on an Internet-connected system provide another potential entry point for hackers. Because of this, all of the guides on this site only use the character (command line) interface.
Most ISPs and Web site hosting services use Linux or UNIX servers. One benefit of learning to use Linux/UNIX commands is that, if your ISP or Web site hosting service includes "shell access" with your account, you'll be able to telnet into your server and use commands at the shell prompt to perform tasks that simply can't be done using an ftp program or a Web interface.
Another key benefit is that if you know how to enter commands at the shell prompt, you'll know what commands to enter into shell scripts to automate tasks. The automation capabilities of shell scripts, when combined with a memory-resident scheduler like cron to run those scripts at regular intervals, will allow you to set up systems that do most of the work for you. On the Packages page we'll show you how to use cron and a shell script to automate the process of retreiving and applying the latest security patches for your system which will help protect Internet-connected servers from new worms and exploits.
Stuck In A Windows World ? A lot of times the hardest thing about learning to use Linux is getting to use Linux on a daily basis. Many organizations are entrenched in Windows or Novell platforms and opportunities to work with Linux simply don't exist.
If you're a network or systems administrator in one of these entrenched environments, one possible solution is to suggest setting up Linux on one or two older PCs to be used in two capacities: - As a network monitoring and troubleshooting tool
- As a security monitoring and testing tool (especially if you have Internet-connected systems)
The reason being is that, as you'll see on the Network Monitoring page, there are a ton of free network monitoring tools (ntop network traffic probe for example) and security utilities (the nmap port scanner for one) available for Linux, and bosses find it hard to argue with the word "free". On a LAN-connected system, running the Wireshark (formerly Ethereal) protocol analyzer can provide you with much of the same information as commerical sniffers costing thousand$ of dollar$ (take it from someone who has used both Wireshark on Linux and Fluke's Protocol Expert on Windows running on the same dual-boot notebook). In addition to using utilities to run security checks against your Internet-connected servers, a Linux system located in your DMZ could also run a free IDS (Intrusion Detection System) application like Snort full time. (We show you how to set up and test Snort on the Snort page.)
Most of the free utilities are available as Debian packages so installation is a snap and any that aren't can be compiled from the available source code. Two good books that detail available free utilities, as well as how to use them, for network monitoring and security testing respectively are: The "Maximum Linux Security" book will help you in setting up a secure DMZ-connected system. You wouldn't want your security monitoring system to itself become the victim of a hacker. Also, check out our Network Monitoring page ! |
Shell Scripting | |
For my money shell scripting is one of the most fun and interesting things to play around with because it is the key tool for automation. Due to of the dominance of GUI interfaces in recent years, which require you to manually supply inputs in the way of mouse clicks, etc. to execute most OS commands, utilities, and programs, the benefits of scripting are pretty much unique to the Linux/UNIX world.
Shell scripts are analogous to DOS batch files. That is, shell scripts are simply text files (created with any text editor) that contain a series of commands. These commands can be Linux OS commands, commands that run programs, commands that "call" other scripts, or any combination of these. That way you just execute the script every time you want to accomplish a task or process rather than typing in all the commands by hand every time. (If you're experienced with DOS, you may want to check out www.tldp.org/LDP/abs/html/dosbatch.html for a comparison of DOS-to-Linux batch file statements and shell commands.) As such, the various shells in Linux are not only a user interfaces but kind of like programming languages as well. The Korn shell is considered the best shell for programming on UNIX systems and the Linux Bash shell incorporates many of the Korn shell's functionality.
Anything you can type in at the shell prompt can be put in a shell script and there are additional scripting-specific commands for condition testing and control of the logical flow of a script. When used with other automation tools, someone who is good at writing shell scripts can accomplish some amazing things. These other tools include:
- cron - The cron scheduler can be used to schedule the routine execution of scripts at a given time or day. (We cover the use of cron on the Packages page.)
- Perl scripts - A shell script can call a Perl script. Given that Perl started out as a reporting language, it has extensive capabilities for working with both string and numerical data in text files. (Perl is are used heavily in CGI scripts on Web servers and is a very easy language to learn.)
- Regular expressions - Commands used to parse text strings (such as user inputs, e-mail messages, or the output of other commands, scripts, or programs) looking for matches and optionally performing substitutions.
- Redirection - Redirecting the input or output of a command, script, or program to an alternate device or process.
- Piping - Using the output of one command, script, or program as the input for another (i.e. "chaining" the execution of programs).
- Custom programs - Custom-coded compiled programs that can be executed from a shell prompt to perform tasks with proprietary data files or hardware.
When used in combination, these tools allow you to develop total end-to-end automation of business processes limited only by your imagination. It's like having a giant puzzle with thousands of different pieces (the commands and tools) that you can use to put together the solution you need. The really neat thing is you also have the ability to make your own custom pieces when needed using Perl scripts or custom programs.
Most Linux shell commands, as well as many utilities and programs written for Linux, have a number of command-line options (aka "switches") that allow you to customize the behavior of the command or utility. Some of these switches can be useful when the commands, utilities, and programs are used in an automated fashion. When combined with redirection and piping, a single line in a shell script can accomplish a lot of work.
Other programs have optional configuration files that can be created to enhance the automation capabilities of the command. For example, you can create a .netrc configuration file for the ftp shell command which contains login, server, and file location information as well as ftp program commands (get, put, lcd, etc.) allowing you to totally automate file transfers. By default the .netrc file is stored in the home directory of the user who creates it as a means of restricting read access to it because it can contain a clear text password. (We use the ftp shell command interactively on the Compiling Software page.)
Shell scripts which use the iptables OS command to turn your Linux system into a proxy server or firewall are given on the Proxy/NAT and Firewall pages respectively. However, these are relatively simple examples which perform only a few functions. (The comment lines in the scripts provide some information on the purpose of the commands.)
Try and get into an automation frame of mind. As you use your computer to do things, ask yourself if the steps you are performing could be automated. As you learn more about Linux and its commands, take note of any commands that would be beneficial to you in your automation needs so you can use them in a future shell script. Keep in mind that, because shell scripts usually contain some Linux OS commands, the better you know these commands the better scripts you'll be able to write.
Multi-User | |
Like other server operating systems, by default Linux and UNIX operate as "multi-user" operating systems. For example, if you put a Linux box on your network, multiple people can simultaneously use their networked Windows PCs to open up a telnet session to the Linux server. Each person would get their own terminal session (i.e. their own shell prompt with the ability to execute whatever shell commands they wish).
Not just anybody can do this. Only those that have an "account" on the system can access it. You create an account on a Linux system for someone by entering a login ID (aka "user name") and password for them. This is why the first thing you see when you boot up (or telnet into) a Linux or UNIX system is a login prompt. You have to let the operating system know which user desires access so it can put the appropriate restrictions in place. (For example, most user accounts can't modify or delete the operating system files.)
As mentioned earlier, when you log into a system using the user name root you can access/modify/delete anything and everything because root is the super-user account on Linux and UNIX systems. The root account is created automatically during the OS installation. When you install Debian, you are asked for a password for the root account and you are asked if you want to create any additional user accounts at that time. You can create accounts for other users after the installation also and you typically use the root account to do this. You also typically use this account to install software and edit the OS and application configuration files (which, again, are usually just text files that you modify using a simple text editor).
Even if you have a stand-alone Linux system you can use this multi-user capability. When the system boots up and presents you with a login prompt, you're actually using only the first of several available terminal sessions. Once you log in, simply hold down the Alt key on your keyboard and press the F2 key. You'll see another login prompt. This is the second terminal session. You can log in here using a different user name. Do an Alt-F3 and you'll yet another login prompt. These type of terminal sessions are also called "consoles" or "virtual terminals". Using multiple consoles, logged in as root on one and some other user on another, is helpful when you want to adjust the level of access to certain files or directories for users. You can adjust the file permissions using the root console and test the effects of the adjustment by switching over to the "regular" user's console.
And this multi-console capability is not limited to virtual terminals. You can connect dumb terminals to the serial ports of the Linux PC and simply uncomment a couple lines in the /etc/inittab configuration file to get them to bring up their own console sessions. Instead of dumb terminals you could also use PCs running a terminal program like HyperTerm to connect a serial port on the terminal PC to a serial port on the Linux PC. (A PC-to-PC connection would require a null-modem cable. Dumb terminals may or may not need a null-type cable depending on their interface.) Since most PCs have two serial ports, three people could all be using the same Linux system simultaneously.
Some user accounts are set up automatically for certain services instead of users. For example, if you set up a Linux system as an FTP server, a user account with the user name ftp is created. Anyone who uses "anonymous FTP" to access the server is doing so using this ftp user account.
Files and Such | |
Before getting into files there is one very important thing you must know about Linux/UNIX:
Linux/UNIX IS case-SENSITIVE !!!
When you see examples of commands, etc. on these pages, they must be entered exactly as shown. For example, a -f will have a totally different meaning than a -F in a Linux/UNIX command. Case-sensitivity also applies to passwords and file names. All of the following file names would be different files under Linux/UNIX:
Linux/UNIX treats everything like a file. When it's writing to your screen it thinks it's just writing to a file. When it sends data through a modem it thinks it's just writing to a file. As a result, all your hardware, including ports, hard-drives, video cards, etc. on your system must be represented somehow somewhere in the file system. Off of the root of the file system is a directory called /dev as in "devices". In this directory you will find a lot of different files all relating to hardware. These files are device drivers, not unlike the device drivers you use with Windows. It's the device driver file that handles the communication and data transfer with the actual piece of hardware.
It's good to know how Linux labels IDE hard-drives. If you're not aware of it, most systems have two IDE "channels", primary and secondary. Each channel can have two hard-drives attached to it, a "master" and a "slave" (which is why you have to look at the jumpers on IDE hard-drives when you install them). Linux refers to these drives this way:
Channel | Drive | Linux ID |
Primary | Master | hda |
Primary | Slave | hdb |
Secondary | Master | hdc |
Secondary | Slave | hdd |
If you have multiple partitions on a single physical drive, each partiton is given a number which is appended to the above drive designation. For example, if you had three partitions on your first hard-drive, you would have hda1, hda2, and hda3. In order to access these partitions, they have to be "mounted". At boot-up Linux will automatically mount any partitions you created during installation.
Because a DVD-ROM drive is a "removable" storage device, you may find that you can't access a DVD after inserting it. You have to manually enter a command to "mount" the DVD-ROM drive before you can access it. On my system, the DVD-ROM drive is the first drive (master) on the secondary IDE channel. As a result, the command I use to mount my DVD-ROM drive is:
mount -t iso9660 /dev/hdc /cdrom
I know this looks a little cryptic at first but it's really quite simple.
- mount makes a device part of the file system.
- -t iso9660 specifies the format of the file system being mounted. (The iso9660 is the standard format for data CDs (and most DVDs) but would be msdos if we were mounting a floppy drive with a DOS-formatted floppy in it.)
- /dev/hdc is the path to the DVD-ROM drive's device driver file. The c in the hdc indicates the first hard-drive on the secondary IDE channel. With SCSI hard-drives the third hard-drive would be sdc.)
- /cdrom is the directory to "map" the device to in the file system so it can be accessed. This has to be an existing directory but it can actually be any directory you want. You could use the mkdir command to create a directory called "shiny-spinning-thing" off the root of the file system and replace /cdrom with /shiny-spinning-thing in the above command if you wanted to.
Using the above mount command simply maps the DVD-ROM drive to the /cdrom directory (which was created during the installation). The directory a device gets mapped to is called the "mount point". As such, in order to access the files on the DVD-ROM once it's been mounted you just go to the mount point its been mapped to by entering
cd /cdrom
and use the ls command to view a list of the files on it. If you get an error along the lines of:
kernel does not recognize /dev/hdc
it's likely your DVD-ROM drive is connected as the slave on the primary IDE channel (i.e. it's /dev/hdb).
Tip: The commands to mount, and list the files on, a DOS-formatted floppy would be:
mkdir /floppy mount -t msdos /dev/fd0 /floppy ls /floppy |
Note that Debian creates the /cdrom directory off the root of the file system during installation but not the /floppy directory. You have to create that yourself.
Other Linux distros and UNIX more often put mount point directories under the /mnt directory. In order to mount a DVD drive on these systems you simply change the target directory in the command:
mount -t iso9660 /dev/hdc /mnt/cdrom
Just as you mounted the removable disk to access it, you have to unmount it when you are done. Pressing the eject button on the DVD drive won't open the tray until you do unmount the drive. For this you just use the umount and specify its mount point in the file system:
umount /mnt/cdrom
Another thing to note about dealing with files in Linux/UNIX is that file extensions mean nothing to the OS. Recall that, as a carry-over from DOS, many files in Windows have a three-character file extension and that this extension is separated from the file name by a period when the file is specified (ex: word.exe). Windows knows a file is a program ("Application") type of file because it has a .EXE extension. In Linux/UNIX there are no extensions. The file name can contain periods but what comes after the period is not an extension to Linux/UNIX.
Note that some Linux/UNIX applications may use a certain set of characters after a period in the file name to specify their data files. For example, the Apache Web server software looks for files that end with .htm, .html, and .shtml and these could be thought of as extensions. Technically however, they're not. And to the Linux/UNIX OS they mean absolutely nothing.
You could name a file this.is.a.file if you wanted to. It's all the same to the OS. So how do you tell Linux/UNIX that a file is a program (application)? Linux/UNIX has a set of "permissions" for each file. These permissions are read, write, and execute. You simply grant the execute permission to a file that is a program or script. You could grant the execute permission to a file that's not a program or script, but since the OS will try and execute whatever statements are in the file as if they were shell script commands, you'll likely end up with a lot of error messages. Depending on what's in the file, you could also end up with disastrous results like a trashed hard-drive.
I won't go into permissions in detail here. It's one of the key points to learn about the Linux/UNIX OS and just about every book on UNIX or Linux covers it. I just wanted to make you aware of them and how they relate to the way you can name files. If you're a Webmaster you may have already worked with permissions. When you use an FTP program to set permissions on CGI scripts and their data files you are using the Linux/UNIX chmod command that sets file permissions.
If you've worked with DOS you can make your Linux experience a little easier, type in the following command at the shell prompt:
alias dir="ls -laF"
This lets you use the familier DOS dir command instead of the UNIX ls command to list files. The ls command without any parameters gives a very simple listing which doesn't even indicate which items are directories and which are files. To get a good detailed listing you need to use ls -laF but that's a lot to type all the time. After issuing the above command, typing in dir at the shell prompt will produce a result like this:
drwxrwxrwx 3 keith web 4096 Aug 8 03:59 ./ dr-xr-sr-x 3 keith web 4096 Aug 6 13:56 ../ -rw-r--r-- 1 keith web 17181 Aug 6 16:04 bdl21dlx.zip -rwxr-xr-x 1 keith web 15818 Aug 6 16:04 bdlogger.cgi* -rw-r--r-- 1 keith web 1 Aug 6 16:04 history.log -rw-r--r-- 1 keith web 1 Aug 6 16:04 pagehits.cnt -rw-r--r-- 1 keith web 1 Aug 6 16:04 period.log -rw-r--r-- 1 keith web 30586 Aug 6 16:04 readme.txt -rw-r--r-- 1 keith web 1 Aug 6 16:04 trigger.dat drwxrwxr-x 2 keith web 4096 Aug 8 03:59 zips/ |
The / after "zips/" indicates it's a directory (as does the "d" in the first column of the permission block on the left). The * after the bdlogger.cgi file name indicates it's flagged as executable. The -rwxr-xr-x (which is 755) in the permission block for the bdlogger.cgi file also indicates that it's flagged as executable (x).
Also be aware that Linux/UNIX does use the period in file names for one special circumstance. File names that start with a period are usually configuration files. Normally every user will have a file called .profile in their home directory on a Linux/UNIX server. In this file are commands which set up the user's environment (default shell, values for environmental variables, etc.). It is somewhat like the config.sys file in DOS. The vi text editor has its own configuration file. So do the character-based versions of telnet and ftp that come with Linux/UNIX and using their configuration files allow you to automate the use of these programs. (For example, you could set up an ftp profile file and use the cron memory-resident scheduler to kick off ftp and automatically download a log file from your Website every night.) If you just use ls to list files, the files that start with a period do not get displayed (which is another reason to use the ls -laF command).
Speaking of home directories, every time a user account is created a home (personal) directory is also created for them. The home directory will have the same name as the username and it's located under the /home directory. If you want to return to your home directory from anywhere in the Linux/UNIX file system, just type in cd and hit Enter. FYI: If you've ever used a program like WS_FTP to make an anonymous ftp connection to an ftp server, you've probably seen several folders (bin, etc, lib) with one called pub (for public download files). The path to the directory where these folders are located, in other words the home directory for anonymous ftp users, is /home/ftp
It takes awhile to learn the Linux file system structure and that can make finding certain files a tough proposition. Luckily there are two commands you can use to locate files.
If you worked with DOS you're probably familier with the "path". The path is just a list of directories (folders for you Windows folks). If you tried to run a program by typing it's name at a DOS prompt, DOS would look in the current ("working") directory first. If it didn't find the program there it would read the path and start looking into each one of the directories specified in the path trying to find the program's .EXE file. If it found it it would run it. If DOS didn't find the file it would return the all-too-well-know message:
Bad command or file name
The file could have been on the drive somewhere, but if it wasn't in the current directory, or any of the directories listed in the path, you got the message.
Linux/UNIX has a path too. (Actually, each user that logs into the system has their own path that they can tailor to their needs.) The directories in the path are those that are the defacto standards for storing program files. The standard directories for storing executable binary files are:
The whereis command will search the directories in the path and tell you if the file you specify is in any of them, and if so which one.
whereis ls
returns
ls: /bin/ls
meaning the ls file is in the /bin directory. If it couldn't find ls it would simply return:
ls:
whereis is only good for finding out which directory in the path files (typically program files) are located. It also may not work on UNIX machines (try the which command on those).
You may get back several paths which indicates that a program (usually different versions of it) are installed in different places. For example, often times Web hosting companies will have two installations of the Perl interpreter on their Web server systems to support a wider range of CGI scripts. However, the list just shows you where the multiple copies are located. If you simply type in the name of the program at the shell prompt you won't know which one is actually getting executed. For that you have to look at the order of the directories in the path. Remember that the system goes through the directories in the order listed in the path running the program from the first directory where it finds the file. To see the path, use the command:
echo $PATH
To find any kind of file (not just program files) anywhere on the hard-drive (not just in the path), use the find command. With this command you specify a starting point and the name (or partial name) of a file. For instance, if you wanted to search the entire hard-drive you'd specify the root of the file system as the starting point like so:
find / -name 'ls'
If you're in a directory and you just want to find out if the file is in the current directory or any of its sub-directories you'd use:
find . -name 'ls'
The . (single period) in Linux/UNIX is like shorthand for "the current directory" and can be used in commands. Two periods (..) means the parent directory (one level up) and can also be used in commands (so you don't have to type in the entire path).
If you want to find out how much hard-drive space your files are taking up, use the:
df
command. The Use% figure will tell you. The numeric values given are for blocks.
The cat command is the equivalent of the DOS type command. It types out the contents of a file to the screen. You don't want to cat a binary file to the screen because you'll just get a bunch of garbage on the screen accompanied by a lot of beeping and possibly totally hose up your display. Use it with text files only. However, if a file is longer than 25 or so lines only the lowest 25 lines will be displayed. The rest just scrolls off the top of the screen. Better to use the more command which does the same thing, except it pauses the display every 25 lines so you can get a look at what's in the file. Press the Space Bar to get the next screen-full. Pressing the Enter key will advance the display one line at a time.
One of the keys to maximizing the automation capabilties is the ability to "chain" the execution of programs. The output of one program can be "piped" into another for it's execution. A simple example of this is:
ls | more
If you use the ls to look at a list of files in a directory which contains a lot of files, you'll miss most of them as their names scroll off the screen. By piping the output of the ls command into the more command the list of file names will get paused. The pipe symbol ( | ) is usually the Shifted character over the \ key on most PC keyboards.
This is a very simple example. You can do some major automation once you get good with Linux commands. (Also check out the grep and sort commands to add to your bag of automation tricks.)
Where to learn more - The best of our bookshelves:
|
More info... | | The Linux Cookbook is based on the Debian distro. It is a good introductory book that could be considered a Linux "Owners Manual" because it covers the operation of the OS but never gets under the hood (doesn't get into the server or networking aspects of Linux). It teaches you how to use the OS. (The sub-title is "Tips and Techniques for Everyday Use".) Thankfully, the majority of the book covers the use of commands at the shell prompt. There are 32 bite-size (10 to 15 page) chapters, each containing a lot of short recipes on how to accomplish specific tasks. No less than eight chapters deal with working with text, which is criitcal if you want to get good using Linux/UNIX. Use of some GUI apps for graphics, etc. are also covered. The "Productivity" section has five chapters which present a lot of good info on disks, printing, and working with other platforms. |
Another useful tool for automation is the "redirect". Things that normally get displayed on the screen (the default output device) can be redirected to a text file or to a device like a printer or modem (which Linux/UNIX thinks is a file). The greater-than sign (>) is used to redirect output. For example, if you wanted to redirect your file listing to a printer you'd use:
ls > lp0
Another painfully simple example but you'll likely see these two characters in example commands so you should know what they're doing. Using piping and redirects in conjunction with the wealth of Linux/UNIX commands available will allow you to set up a system that'll do everything but make coffee (and I wouldn't doubt some engineering student somewhere got a Linux system to do that too).
Linux/UNIX file systems support things called symbolic links, more commonly referred to as "symlinks". These are the equivalent to shortcuts on Windows systems. They are most often used to create symlinks to binary executable files and data files so that they appear to be located in many different areas on a hard-drive. However, you can also create symlinks to entire directories so that these directories (and their contents) are accessible via different locations within the file system (for example, in the home directories of certain users). The advantage of symlinks is that any changes need to be made to the one file or directory that all the symlinks point to rather than having to make changes to multiple copies of the same file or directory.
There are actually two types of symbolic links, hard and soft with soft symlinks being much more common. The Windows shortcut analogy refers to soft symlinks. Hard symlinks create an actual copy of the file of directory. When you use the ls command to look at a list of files in a directory, you can tell which ones are symlinks because they'll use the -> characters to point to the original file. As you'll see in the next section, symlinks are commonly used in the startup directories. When a system is started at a particular runlevel it runs the boot-up shell scripts located in the startup directory for that runlevel (each runlevel has its own directory). Many scripts are run no matter which runlevel is used. Instead of putting copies of these scripts in all of the runlevel startup directories, all the scripts are put in one directory and only symlinks are put in the individual runlevel startup directories.
For example, the shell script to start up the cron background scheduler is called cron and it's run at all runlevels. It is located in the /etc/init.d directory which holds all the startup scripts ("init.d" is the geek abbreviation for "start daemons"). If you were to use the ls command to look at the files in the startup directory for runlevel 2 (rc2.d), you would see a soft symlink to the cron startup script:
S89cron -> ../init.d/cron
Again, the advantage is if changes would need to be made to the cron script one would just edit the actual script file and they would take affect at all runlevels. (It's also a good way to cut down on disk space usage.) One important point to note is that even symlinks have to have file permissions set on them. They do not "inherit" the permissions of the file they point to. If a symlink points to an executable shell script, the symlink itself must also have the eXecutable permission applied to it.
Starting Up | |
All of the Linux distros and flavors of UNIX can be grouped into two families due to the evolution of UNIX; "System V" (System Five) and "BSD". This knowledge is useful only in that the two families use different directory structures for their startup files, and you'll want to know what files are used during the startup process so you can set certain processes to start up automatically when the system boots up.
Debian and most other Linux distros are part of the System V family. A couple releases back Sun Microsystems changed their Solaris operating system to be a part of the System V family also. As a result, knowing this startup directory system structure is useful for most Linux/UNIX systems. There may be some variations in directory names, etc. among different distros but the concept is the same for all.
When you boot your system, services and processes are started via shell scripts. All shell scripts that could possibly get executed when you boot your system are stored in the /etc/init.d sub-directory. (Note that even directories can have periods in their name.)
In order to understand the boot-up process you have to be familier with runlevels. Linux/UNIX systems can be set to run in different modes of functionality. They can operate in a single-user mode, such as in the case of strictly being a "workstation" (desktop PC), or they can run in multi-user mode to operate as a server. Each runlevel is identified by a single-digit number. The runlevels worth remembering are:
0 - shut down the system
1 - single-user mode
2 - 5 multi-user mode
6 - reboot
Runlevel 2 is Debian's default. Having several different multi-user runlevels means that you can customize them. For example, you could disable NFS file sharing in runlevel 2 because that's not something you want enabled on an Internet server. If you're planning on setting up a file server for your internal network, you could then change the default runlevel to 3 which would still have NFS file sharing. Initially there's no difference between the four multi-user runlevels because they're all set up the same (all start the same services during bootup).
You can change to different runlevels on the fly using the init command followed by the desired runlevel. This can be useful, for example, if you wanted to restart the system remotely. You couldn't use the Ctrl-Alt-Del key sequence to restart a system you were connected to via a telnet session. Instead you would just use the init 6 command. Of course, doing so would disconnect your telnet session, but you could telnet back in once the system has had a chance to reboot.
Recall that any shell script that might be run at system startup are stored in the /etc/init.d directory. All these different runlevels do is run a different set of the scripts stored in this directory at startup. When the system boots up the first startup file it reads is the /etc/inittab text file. This file basically tells the system what the default run level is via the line
id:2:initdefault:
This is the line in the /etc/inittab file you need to edit if you want to change the default run level. Most distros default to runlevel 3 which is not a secure thing to do for Internet servers.
Each runlevel also has it's own subdirectory under the /etc directory, and the subdirectory name contains the runlevel number. The naming convention for these subdirectories is:
rc2.d
with the runlevel represented as the number (in blue) above.
These runlevel subdirectories don't actually contain any scripts. Rather, they all contain symbolic links to the scripts in the /etc/init.d subdirectory. This is so all the different runlevels can share common scripts eliminating the need to have multiple copies of the same script in multiple runlevel subdirectories. (That's why any script that could be run at startup is saved to the /etc/init.d subdirectory.)
Initially (on a Debian system), all the different runlevel subdirectories contain the same set of links to the same scripts in the /etc/init.d subdirectory. As a result, the system is functionally the same no matter which multi-user runlevel is chosen. You customize the system's behavior for each runlevel by adding and/or deleting links in the different runlevel subdirectories.
For example, if you want runlevel 2 to be for Internet services, you could go into the /etc/rc2.d subdirectory and delete the links to the NFS scripts. Then you could go into the /etc/rc3.d subdirectory and delete all the links to the Internet services scripts (like Apache, Sendmail, and inetd) so that runlevel 3 was set for network file sharing.
It all seems confusing the first time you find out about it. Here's a diagram of the process. Note that special single-user runlevel is always run at bootup to set up the basic system. The higher runlevel scripts are additionally run to provide the multi-user/server functionality.
There's another reason for using links in the individual runlevel subdirectories. These links use a special naming convention to indicate when they should be run and in what order they should be run. Links that have names starting with an upper-case S are called when the runlevel is entered (starting). Links that have names starting with an upper-case K are called when the runlevel is exited to Kill services. (Due to this naming convention it is more common to simply rename a symbolic link so that it begins with a dash (-) or underscore (_) rather than delete them when customizing a runlevel.)
The directories for the multi-user runlevels only contain links that start with an S. Only those run levels that deal with shutting down or restricting functionality of the system (0, 1, 6) have links that start with a K to kill services.
The numbers after the leading S or K determine the order in which the links are called, lowest number first. This is important for process dependancy reasons. For example, you wouldn't want to start up the Samba service if the networking service wasn't already running. For example:
S20thisservice
S30thatservice
S40anotherservice
S80yetanother
You'll see how to add your own script to the startup process near the end of the Proxy/NAT page. When you add a link to one your scripts you typically use a high number (70s and above) to make sure everything your script needs is already running.
Shutting Down | |
You can't just turn off a Linux system like you would a DOS system. It has to be "shut down". If you want to turn the system off, there are several shutdown commands you can use but with most Linux distros (most UNIX flavors don't support this) I find it much easier just to hit Ctrl-Alt-Del. Unlike a DOS system, doing this will not immediately reboot the system. It will first shut down all processes and dismount the file systems. Once it starts to reboot you can turn the system off.
If you want to keep the system running but not be logged in (as in the case of using the system as a server), just type in exit at the shell prompt. This will log you out and return to the login screen.
Note: You should never leave an Internet-connected server console logged in as anyone, especially root.
Source and Binaries | |
Linux, like every other OS, works with two types of files, binary and text. Binary files appear as gibberish to us mere mortals. They can be programs or data files written by programs or the operating system. Text files are those that can be read by us human beings. We can view and change them using any text editor (like Notepad in Windows).
The files that Linux, and every other OS, works with can fall into one of two different categories, programs or data. Either a file contains a set of instructions to be executed by the CPU (a program) or it contains information (data). Program (or "application") files can be anything from OS commands to a word processor. Data files can hold information for programs (like configuration information) or for us humans (like a word processing document or some database).
The above two categories are admittedly broad, but it helps to think of all files as fitting into one or the other category. There is no relationship between types of files and categories of files. To illustrate this, consider the following:
- Binary files can be programs (like .EXE files on a Windows)
- Binary files can be data (like graphics files created using a program like Paintbrush)
- Text files can be programs ("script" or "batch" files that typically contain a series of OS commands)
- Text files can be data (Linux makes heavy use of text files to store configuration information for programs and programs like Excel can "export" data to a text file)
Most programmers would probably whince at referring to a text batch file as a "program", but they do contain a series of commands that are executed by the CPU so for categorization purposes they fit the bill.
You can't tell just by looking at an ls listing what types the files are. The Linux file command will tell you:
file /bin/ls
If a file is a binary file the file command will return data or a long string with something like ELF 32-bit somewhere in it (ELF stands for Executable and Link Format). If a file is text it will return a short string with the word text in it.
The Linux operating system actually comes in two forms. The binaries are the pre-compiled, ready-to-run OS files that you install on your PC, similar to installing the Windows OS. In addition, the "open source" nature of Linux means that the source code files of the operating system (what the programmers wrote) is also freely available. Source code files are text files. A programmer types programming statements into a text editor, saves the text file, and then "compiles" it to generate a binary file which is the equivalent of a Windows .EXE file.
Someone who is good at C programming (the language Linux is written in) can open one of the Linux source code files in a text editor, customize it to add special functions or otherwise tailor it to their needs, and then recompile it. Conversely, the source code for Windows is a closely-guarded secret and the Windows CDs you buy from Microsoft only contain the binaries. (Now you know why the term "open source" is often used to refer to Linux and Linux applications.) As a result of this, Linux is also a fully customizable operating system. If you don't like the way Linux does something, you can change it any way you want by altering the source code and re-compiling it. That's one reason "embedded" Linux is often found serving dedicated functions controlling devices like the TiVO DVR (Digital Video Recorder).
The open source nature of Linux and open Linux applications also keeps people honest because anyone can view the source code and see exactly what a program is doing (the "many eyes" theory). The closed source nature of Windows and Windows programs does not offer this benefit. When you go to microsoft.com and use the Windows Update feature, you have no way of knowing what is actually being checked or what information the update application is extracting from your system. Only a select few at Microsoft know this and they ain't talkin'.
All source code files (not just Linux source code files) can be converted into the executable binaries by compiling them. In the Linux/UNIX world you will often see applications and utilities distributed in source code (text file) format. This is because there are many different "flavors" and versions of Linux and UNIX to run on the wide variety of hardware platforms out there. The source code for an application or utility compiled on one Linux/UNIX system will generate a binary file that may not run on a system running a different flavor of Linux or UNIX. Compiled binaries may also be platform-specific. This is why you'll see different Linux disc sets offered for different hardware platforms (i386-Intel, m68K-Mac, Sparc-Sun, ppc-PowerPC). Windows is only compiled to run on the Intel (and compatible) platform. So Linux gives you the advantage of being able to run the same OS on all of your different hardware platforms, including S/390 mainframes.
So the person or company that wrote the application or utility has two choices. They can either make a wide variety of binaries available for all of the different flavors of Linux/UNIX (which means they would have to have a wide variety of machines and compile it on each one of them), or they can just make the source code available and let the users compile it themselves on their particular machines. That's why many flavors of Linux and UNIX include a C compiler, and why many free Linux/UNIX applications and utilities are only available in source code format. Debian makes pre-compiled binary disc sets available for a wide variety of hardware platforms and also has a source disc set which includes the source code for all of the packages included with Debian.
Packages and Tarballs | |
Using a C compiler to generate binaries from source files is often one of the more frustrating things about learning to use Linux/UNIX. There are as many different C compilers out there as their are flavors of Linux/UNIX and some work better than others. This frustation is what led to the development of packages. Instead of manually compiling and copying all of the files associated with a new application into the appropriate directories, a "package manager" is used to help automate the process. Red Hat developed the Red Hat Package Manager which uses .RPM files, and Debian has a package manager that uses .DEB files. (We show you how to work with Debian packages on the Packages page.) Many feel the Debian package manager is easier to use, and it allows you to pull packages and package updates over the Internet.
This doesn't mean you want to restrict yourself to packages entirely. There are many good utilities out there that do not come in a package format. There are a lot of Web sites that offer Linux/UNIX utilities for download. When you go looking for a particular utility, check to see if it's available as a binary for your flavor of Linux/UNIX. If not, the source will be available.
The files you actually download will typically be compressed into an archive, much like PKZip or WinZip files you can download for Windows systems. There is even a zip-like program for Linux/UNIX called gzip (GNU zip). (Most Linux and UNIX distros include the gzip/gunzip utilities.) gziped files typically have .gz at the end. The command to uncompress and extract a gzip-ed file is:
gunzip file-name.tar.gz
PKZip and WinZip do two things. They combine multiple files into a single archive and then they compress it. With Linux/UNIX this is a two step process. gzip handles the compression and tar handles the combining. tar is the utility that combines/extracts multiple files but it doesn't do any compression. Some download files aren't compressed because they're not all that big. If the file you download is an uncompressed archive of multiple files, it will likely have a .tar extension. Such files are called "tar balls". The command to extract a tar ball is
tar -xvf file-name.tar
One thing to note about tar is that it maintains the directory tree structure during extraction if subdirectories are included in the combining process.
A file that has been combined and compressed may have two "extensions". If you see a file name that ends in .tar.gz or simply .tgz you have to first gunzip it, which will remove the .gz from the end of the file name. Then you extract that file using tar.
Newer versions of tar can handle both the uncompressing and extracting. If you have a file with a .gz extension try entering this command:
tar -zxvf file-name.tar.gz
The additional z switch will uncompress the file if you do have a newer version of tar.
Don't be shy about un-tar-ing and compiling source files. They are valuable skills to have in the Linux/UNIX world. Many device drivers are only available as source code files. We'll show you how to compile a chat application and a driver from source code on the Compiling Software page.
The Kernel | |
As mentioned early on in this page, the kernel is the guts of the operating system. It's the executable file that contains the main OS code. Every operating system has a kernel. The various distributions all use the Linux kernel.
New and impoved versions of applications are released with increasing major version numbers (ex: Netscape 3, Netscape 4, etc.), or minor version numbers (ex: Netscape 4.0, 4.5, 4.76). Similarly, newer versions of the Linux kernel are also occasionally released. Debian 3.1 ("Sarge") used the 2.4 kernel by default while Debian 4.0 ("Etch") started using the 2.6 kernel by default.
You should be aware of the kernel version (not just the distro version) you use because procedures for setting up various system functions often change with the kernel versions. For example, the Firewall page covers utilizing IPTABLES, the firewalling utility for the 2.4 and later kernels. (With the 2.2 kernel IPCHAINS is used.)
Because Linux is used a great deal as an operating system for servers, and because servers typically aren't upgraded as often or as quickly as workstations, support (in the way of Web resources, etc.) for older kernel versions is usually available for long periods after newer kernel versions are released. As a result, you don't have to feel rushed to upgrade if you're using an older version of a distro or a distro with an older version kernel.
There's one part of the Linux file system that's not really a file system at all. It's actually a window into the system's memory where you can see what the kernel sees. By using the commands:
cd /proc
ls -laF | more
you can view the various kernel processes currently in memory. While the listing may show many "files" as having a size of zero bytes, there is actually a lot of information in them. You can see this information by using the usual cat or more file commands to view a particular entry's information. Programs will often access information in this directory to read current system parameters and adjust their performance accordingly.
Many Linux books and publications cover recompiling a kernel. You obtain newer kernel source code and compile it into a newer kernel that supports some new hardware or feature. This isn't such a good idea anymore. Many distros customize the kernel a bit and replacing it with a "stock" kernel could cause problems. Plus the fact that you can usually upgrade the kernel using whatever package utility your chosen distro has. If you were put off by the thought of having to recompile the kernel, don't be. In addition, compiling support into the kernel for new devices isn't the preferred way of doing things anymore. These days you want to use modules. We'll get into modules on the Compiling Software page.
Who Da Man ! | |
The UNIX and Linux OSs have built-in, on-line help. They're called "man pages". All you have to do is type in man followed by the name of a command you want help with. For example:
man chmod
However, be advised that these pages were written by the same type of people who wrote the operating system. As a result, they're just this side of being understandable by us mere mortals. But they may help you out in a pinch. When you're done viewing a man page, just hit 'q' to quit.
In addition to man pages, through a volunteer effort the Linux community has produced an extensive list of instructional documents called HOWTOs. There's a HOWTO on just about any subject you can think of. Some of the HOWTOs are not as clear as most would like, but others are quite well written. The central repository for HOWTO documents is at the Linux Documentation Project Web site. A link to this site is given on the Internet Resources page.
Learning Linux isn't easy, especially if you never worked with DOS. And getting good with Linux will take time. However, the popularity of Linux continues to grow. Investing the time and effort to become proficient with Linux could pay huge dividends in the future.
There are a ton of Linux books out there, some of which are featured on these pages. The problem with books for those just starting out is that most beginner books don't get into networking or the server aspects of Linux. Books that cover the networking and server aspects of Linux tend to assume you're well-versed in the use of Linux.
As mentioned on the home page, we'll take the middle ground on these pages. We'll give wide but relatively shallow coverage of things ranging from basic Linux/UNIX commands to setting up a firewalling proxy server. If you go through these pages and follow along configuring your own system, you'll get a decent handle on what it's like working with Linux.
Did you find this page helpful ?
If so, please help keep this site operating
by using our DVD or book pages.
Site, content, documents, original images Copyright © 2003-2011 Keith Parkansky All rights reserved
Duplication of any portion of this site or the material contained herein without
the express written consent of Keith Parkansky, USA is strictly prohibited.
This site is in no way affiliated with the Debian Project, the debian.org Web site, or
Software In The Public Interest, Inc. No endorsement of this site by the Debian Project
or Software In the Public Interest is expressed or implied. Debian and the Debian logo
are registered trademarks of Software In The Public Interest, Inc. Linux is a registered
trademark of Linus Torvalds. The Tux penguin graphic is the creation of Larry Ewing.
LIABILITY
IN NO EVENT WILL KEITH PARKANSKY OR BLUEHOST INCORPORATED OR ANY OF ITS' SUBSIDIARIES BE LIABLE TO ANY PARTY (i) FOR ANY DIRECT, INDIRECT, SPECIAL, PUNITIVE OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, DAMAGES FOR LOSS OF BUSINESS PROFITS, BUSINESS INTERRUPTION, LOSS OF PROGRAMS OR INFORMATION, AND THE LIKE), OR ANY OTHER DAMAGES ARISING IN ANY WAY OUT OF THE AVAILABILITY, USE, RELIANCE ON, OR INABILITY TO USE THE INFORMATION, METHODS, HTML OR COMPUTER CODE, OR "KNOWLEDGE" PROVIDED ON OR THROUGH THIS WEBSITE, COMMONLY REFERRED TO AS THE "ABOUT DEBIAN" WEBSITE, OR ANY OF ITS' ASSOCIATED DOCUMENTS, DIAGRAMS, IMAGES, REPRODUCTIONS, COMPUTER EXECUTED CODE, OR ELECTRONICALLY STORED OR TRANSMITTED FILES OR GENERATED COMMUNICATIONS OR DATA EVEN IF KEITH PARKANSKY OR BLUEHOST INCORPORATED OR ANY OF ITS' SUBSIDIARIES SHALL HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES, AND REGARDLESS OF THE FORM OF ACTION, WHETHER IN CONTRACT, TORT, OR OTHERWISE; OR (ii) FOR ANY CLAIM ATTRIBUTABLE TO ERRORS, OMISSIONS, OR OTHER INACCURACIES IN, OR DESTRUCTIVE PROPERTIES OF ANY INFORMATION, METHODS, HTML OR COMPUTER CODE, OR "KNOWLEDGE" PROVIDED ON OR THROUGH THIS WEBSITE, COMMONLY REFERRED TO AS THE "ABOUT DEBIAN" WEBSITE, OR ANY OF ITS' ASSOCIATED DOCUMENTS, DIAGRAMS, IMAGES, REPRODUCTIONS, COMPUTER EXECUTED CODE, OR ELECTRONICALLY STORED, TRANSMITTED, OR GENERATED FILES, COMMUNICATIONS, OR DATA. ALL INFORMATION, METHODS, HTML OR COMPUTER CODE IS PROVIDED STRICTLY "AS IS" WITH NO GUARANTY OF ACCURACY AND/OR COMPLETENESS. USE OF THIS SITE CONSTITUTES ACCEPTANCE OF ALL STATED TERMS AND CONDITIONS.