Perl




  Paradigm: Multi-paradigm
  Year: 1987
  Designer: Larry Wall
  Latest Release Version: 5.8.8
  Latest Release Date: January 31, 2006
  Typing: Dynamic
  Influenced By: AWK, BASIC, BASIC-PLUS, C, C++, Lisp, Pascal, Python, Sed, Unix Shell
  Influenced: Python, PHP, Ruby, ECMAScript
  Operating System: Cross-platform
  License: GNU General Public License, Artistic License
  Website: http://www.perl.org/


Perl is a dynamic programming language created by Larry Wall and first released in 1987. Perl borrows features from a variety of other languages including C, shell scripting (sh), AWK, sed and Lisp.[1]

Structurally, Perl is based on the brace-delimited block style of AWK and C, and was widely adopted for its strengths in string processing and lack of the arbitrary limitations of many scripting languages at the time.[2]


HISTORY

Larry Wall began work on Perl in 1987, while working as a programmer at Unisys,[3] and released version 1.0 to the comp.sources.misc newsgroup on December 18, 1987. The language expanded rapidly over the next few years. Perl 2, released in 1988, featured a better regular expression engine. Perl 3, released in 1989, added support for binary data streams.

Until 1991, the only documentation for Perl was a single (increasingly lengthy) man page. In 1991, ''Programming Perl'' (known to many Perl programmers as the "Camel Book") was published, and became the ''de facto'' reference for the language. At the same time, the Perl version number was bumped to 4, not to mark a major change in the language, but to identify the version that was documented by the book.

Perl 4 went through a series of maintenance releases, culminating in Perl 4.036 in 1993. At that point, Larry Wall abandoned Perl 4 to begin work on Perl 5.

Initial design of Perl 5 continued into 1994. The ''perl5-porters'' mailing list was established in May 1994 to coordinate work on porting Perl 5 to different platforms. It remains the primary forum for development, maintenance, and porting of Perl 5 (http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/).

Perl 5 was released on October 17, 1994. It was a nearly complete rewrite of the interpreter, and added many new features to the language, including objects, references, lexical (my) variables, and modules. Importantly, modules provided a mechanism for extending the language without modifying the interpreter. This allowed the core interpreter to stabilize, even as it enabled ordinary Perl programmers to add new language features.

As of 2007, Perl 5 is still being actively maintained. Important features and some essential new language constructs have been added along the way, including Unicode support, threads, improved support for object-oriented programming and many other enhancements. The latest stable release is Perl 5.8.8.

One of the most important events in Perl 5 history took place outside of the language proper, and was a consequence of its module support. On October 26, 1995, the Comprehensive Perl Archive Network (CPAN) was established as a repository for Perl modules and Perl itself. At the time of writing, it carries over 11,000 modules by over 5,000 authors. CPAN is widely regarded as one of the greatest strengths of Perl in practice.


Name


Perl was originally named "Pearl", after the Parable of the Pearl from the Gospel of Matthew. Larry Wall wanted to give the language a short name with positive connotations; he claims that he considered (and rejected) every three- and four-letter word in the dictionary. He also considered naming it after his wife Gloria. Wall discovered the existing PEARL programming language before Perl's official release and changed the spelling of the name.

The name is normally capitalized (''Perl'') when referring to the language and uncapitalized (''perl'') when referring to the interpreter program itself, since Unix-like file systems are case-sensitive. Before the release of the first edition of ''Programming Perl'', it was common to refer to the language as ''perl''; Randal L. Schwartz, however, capitalised the language's name in the book to make it stand out better when typeset. The case distinction was subsequently adopted by the community.[4]

The name is occasionally given as "PERL" (for Practical Extraction and Report Language). Although the expansion has prevailed in many of today's manuals, including the official Perl man page, it is merely a backronym. The name does not officially stand for anything, so spelling it in all caps is incorrect and is considered a shibboleth (label of outsiders) in the Perl community.[5] Several other expansions have been suggested, including Wall's own humorous ''Pathologically Eclectic Rubbish Lister''. Indeed, Wall claims that the name was intended to inspire many different expansions.[6]


The camel symbol


''Programming Perl'', published by O'Reilly Media, features a picture of a camel on the cover, and is commonly referred to as ''The Camel Book''. This image of a camel has become a general symbol of Perl.

O'Reilly owns the image as a trademark, but claims to use their legal rights only to protect the ''"integrity and impact of that symbol"'' (http://perl.oreilly.com/usage/). O'Reilly allows non-commercial use of the symbol, and provides ''Programming Republic of Perl'' logos and ''Powered by Perl'' buttons (http://www.oreillynet.com/images/perl/).


OVERVIEW


Perl is a general-purpose programming language originally developed for text manipulation and now used for a wide range of tasks including system administration, web development, network programming, GUI development, and more.

The language is intended to be practical (easy to use, efficient, complete) rather than beautiful (tiny, elegant, minimal), according to the perlintro(1) man page. Its major features include support for multiple programming paradigms (procedural, object-oriented, and functional styles), automatic memory management, built-in support for text processing, and a large collection of third-party modules.


Features


The overall structure of Perl derives broadly from C. Perl is procedural in nature, with variables, expressions, assignment statements, brace-delimited code blocks, control structures, and subroutines.

Perl also takes features from shell programming. All variables are marked with leading sigils, which unambiguously identify the data type (scalar, array, hash, etc.) of the variable in context. Importantly, sigils allow variables to be interpolated directly into strings. Perl has many built-in functions which provide tools often used in shell programming (though many of these tools are implemented by programs external to the shell), such as sorting and calling on system facilities.
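
As a rough illustration of sigils and string interpolation, the following sketch uses invented variable names ($name, @colors, %age) that do not come from any particular program:

```perl
use strict;
use warnings;

# Hypothetical variables, chosen only for illustration.
my $name   = 'world';                  # $ marks a scalar
my @colors = ('red', 'green');         # @ marks an array
my %age    = (joe => 30, sam => 25);   # % marks a hash

# Sigils let variables be interpolated directly into double-quoted strings.
my $greeting = "Hello, $name!";
my $list     = "Colors: @colors";      # array elements are joined with a space
my $detail   = "Joe is $age{joe} years old";

print "$greeting\n$list\n$detail\n";
```

Note that a single hash element is written with the $ sigil ($age{joe}), because the value being accessed is a scalar.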

Perl takes lists from Lisp, associative arrays (hashes) from AWK, and regular expressions from sed. These simplify and facilitate many parsing, text handling, and data management tasks.

In Perl 5, features were added that support complex data structures, first-class functions (i.e., closures as values), and an object-oriented programming model. These include references, packages, class-based method dispatch, and lexically scoped variables, along with compiler directives (for example, the strict pragma). A major additional feature introduced with Perl 5 was the ability to package code as reusable modules. Larry Wall later stated that "The whole intent of Perl 5's module system was to encourage the growth of Perl culture rather than the Perl core." (Usenet post, May 10th 1997, with ID 199705101952.MAA00756@wall.org.)
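
A minimal sketch of two of these Perl 5 features, references and closures, might look like the following. The data and the make_counter helper are hypothetical and serve only as illustration:

```perl
use strict;
use warnings;

# A reference to an anonymous hash that holds an anonymous array:
my $person = { name => 'joe', scores => [32, 45, 16] };
my $second_score = $person->{scores}[1];

# make_counter is a hypothetical helper. The anonymous sub it returns
# is a closure: it keeps access to the lexical variable $count.
sub make_counter {
    my $count = shift;
    return sub { return $count++ };
}

my $next = make_counter(5);
my @seen = ($next->(), $next->(), $next->());

print "second score: $second_score\n";
print "counter values: @seen\n";
```

Each call to make_counter creates a fresh $count, so separate counters would not interfere with one another.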

All versions of Perl do automatic data typing and memory management. The interpreter knows the type and storage requirements of every data object in the program; it allocates and frees storage for them as necessary. Legal type conversions—for example, conversions from number to string—are done automatically at run time; illegal type conversions are fatal errors.
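
The automatic conversions can be sketched as follows. The fatal case shown here assumes that the strict pragma is enabled; the variable names are invented for the example:

```perl
use strict;
use warnings;

my $sum  = '10' + 5;           # the string is converted to a number
my $text = 'Total: ' . 42;     # the number is converted to a string

# Under "use strict", treating a plain string as an array reference
# is a fatal error; eval traps it so the sketch can report it.
my $not_a_ref = 'abc';
my $error = '';
eval { my @items = @{$not_a_ref}; 1 } or $error = $@;

print "$sum\n$text\n";
print "conversion failed: $error" if $error;
```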


Design


The design of Perl can be understood as a response to three broad trends in the computer industry: falling hardware costs, rising labor costs, and improvements in compiler technology. Many earlier computer languages, such as Fortran and C, were designed to make efficient use of expensive computer hardware. In contrast, Perl is designed to make efficient use of expensive computer programmers.

Perl has many features that ease the programmer's task at the expense of greater CPU and memory requirements. These include automatic memory management; dynamic typing; strings, lists, and hashes; regular expressions; introspection and an eval() function.

Wall was trained as a linguist, and the design of Perl is very much informed by linguistic principles. Examples include Huffman coding (common constructions should be short), good end-weighting (the important information should come first), and a large collection of language primitives. Perl favors language constructs that are concise and natural for humans to read and write, even where they complicate the Perl interpreter.

Perl syntax reflects the idea that "things that are different should look different". For example, scalars, arrays, and hashes have different leading sigils. Array indices and hash keys use different kinds of braces. Strings and regular expressions have different standard delimiters. This approach can be contrasted with languages like Lisp, where the same S-expression construct and basic syntax is used for many different purposes.

Perl does not enforce any particular programming paradigm (procedural, object-oriented, functional, etc.) or even require the programmer to choose among them.

There is a broad practical bent to both the Perl language and the community and culture that surround it. The preface to ''Programming Perl'' begins, "Perl is a language for getting your job done." One consequence of this is that Perl is not a tidy language. It includes many features, tolerates exceptions to its rules, and employs heuristics to resolve syntactical ambiguities. Because of the forgiving nature of the compiler, bugs can sometimes be hard to find. Discussing the variant behaviour of built-in functions in list and scalar contexts, the perlfunc(1) manual page says "In general, they do what you want, unless you want consistency."

Perl has several mottos that convey aspects of its design and use. One is ''"There's More Than One Way To Do It."'' (TIMTOWTDI, usually pronounced 'Tim Toady'). Others are ''"Perl: the Swiss Army Chainsaw of Programming Languages"'' and ''"No unnecessary limits"''. A stated design goal of Perl is to make easy tasks easy and difficult tasks possible. Perl has also been called ''"The Duct Tape of the Internet"''.

There is no written specification or standard for the Perl language, and no plans to create one for the current version of Perl. There has only ever been one implementation of the interpreter. That interpreter, together with its functional tests, stands as a ''de facto'' specification of the language.


Applications


Perl has many and varied applications, compounded by the availability of many standard and third-party modules.

Perl has been used since the early days of the Web to write CGI scripts. It is known as one of "the three Ps" (along with Python and PHP), the most popular dynamic languages for writing Web applications (which now also include Ruby). It is also an integral component of the popular LAMP solution stack for web development. Large projects written in Perl include Slash, Bugzilla, TWiki and Movable Type. Many high-traffic websites, such as Amazon.com, LiveJournal.com, Ticketmaster.com and IMDb.com,[7] use Perl extensively.

Perl is often used as a glue language, tying together systems and interfaces that were not specifically designed to interoperate, and for "data munging", i.e., converting or processing large amounts of data for tasks like creating reports. In fact, these strengths are intimately linked. The combination makes Perl a popular all-purpose tool for system administrators, particularly as short programs can be entered and run on a single command line.

With a degree of care, Perl code can be made portable across Windows and Unix. Portable Perl code is often used by suppliers of software (both COTS and bespoke) to simplify packaging and maintenance of software build and deployment scripts.

Graphical user interfaces (GUIs) may be developed using Perl. In particular, Perl/Tk is commonly used to enable user interaction with Perl scripts. Such interaction may be synchronous or asynchronous, using callbacks to update the GUI. For more information about the technologies involved see Tk, Tcl and WxPerl.

Perl is also widely used in finance and bioinformatics, where it is valued for rapid application development and deployment, and the ability to handle large data sets.


Implementation


Perl is implemented as a core interpreter, written in C, together with a large collection of modules, written in Perl and C. The source distribution is, as of 2005, 12 MB when packaged in a tar file and compressed. The interpreter is 150,000 lines of C code and compiles to a 1 MB executable on typical machine architectures. Alternatively, the interpreter can be compiled to a link library and embedded in other programs. There are nearly 500 modules in the distribution, comprising 200,000 lines of Perl and an additional 350,000 lines of C code. (Much of the C code in the modules consists of character encoding tables.)

The interpreter has an object-oriented architecture. All of the elements of the Perl language—scalars, arrays, hashes, coderefs, file handles—are represented in the interpreter by C structs. Operations on these structs are defined by a large collection of macros, typedefs and functions; these constitute the Perl C API. The Perl API can be bewildering to the uninitiated, but its entry points follow a consistent naming scheme, which provides guidance to those who use it.

The execution of a Perl program divides broadly into two phases: compile time and run time. (A description of the Perl 5 interpreter can be found in ''Programming Perl'', 3rd Ed., chapter 18.) At compile time, the interpreter parses the program text into a syntax tree. At run time, it executes the program by walking the tree. The text is parsed only once, and the syntax tree is subject to optimization before it is executed, so the execution phase is relatively efficient. Compile-time optimizations on the syntax tree include constant folding. The two phases can, however, be interleaved: a BEGIN block executes at compile time, while the eval function initiates compilation during run time. Both operations are an implicit part of a number of others; most notably, the use clause that loads libraries, known in Perl as modules, implies a BEGIN block.
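
The interleaving of the two phases can be observed with a small sketch like the one below. The @order array is an invented name used only to record which piece of code ran first:

```perl
use strict;
use warnings;

our @order;   # records the order in which the pieces run

push @order, 'run time';

# A BEGIN block executes as soon as it has been compiled,
# before any ordinary statement in the file runs.
BEGIN { push @order, 'compile time' }

# A string eval compiles and runs new code while the program is running.
my $sum = eval '2 + 3';

print join(', ', @order), "\n";
print "eval result: $sum\n";
```

Although the BEGIN block appears after the first push in the source text, its entry lands in @order first, because the block runs during compilation.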

Perl has a context-sensitive grammar which can be affected by code executed during an intermittent run-time phase.[8] Therefore Perl cannot be parsed by a straight Lex/Yacc lexer/parser combination. Instead, the interpreter implements its own lexer, which coordinates with a modified GNU Bison parser to resolve ambiguities in the language. It is said that "only perl can parse Perl", meaning that only the Perl interpreter (''perl'') can parse the Perl language (''Perl''). The truth of this is attested to by the persistent imperfections of other programs that undertake to parse Perl, such as source code analyzers and auto-indenters, which have to contend not only with the many ways to express unambiguous syntactic constructs, but also the fact that Perl cannot be parsed in the general case without executing it.

Perl is distributed with some 120,000 functional tests. These run as part of the normal build process, and extensively exercise the interpreter and its core modules. Perl developers rely on the functional tests to ensure that changes to the interpreter do not introduce bugs; conversely, Perl users who see the interpreter pass its functional tests on their system can have a high degree of confidence that it is working properly.

Maintenance of the Perl interpreter has become increasingly difficult over the years. The code base has been in continuous development since 1994. The code has been optimized for performance at the expense of simplicity, clarity, and strong internal interfaces. New features have been added, yet virtually complete backward compatibility with earlier versions is maintained. The size and complexity of the interpreter are a barrier to developers who wish to work on it.


Availability


Perl is free software, and is licensed under both the Artistic License and the GNU General Public License. Distributions are available for most operating systems. It is particularly prevalent on Unix and Unix-like systems, but it has been ported to most modern (and many obsolete) platforms. With only six reported exceptions, Perl can be compiled from source code on all Unix-like, POSIX-compliant or otherwise Unix-compatible platforms.[9] However, this is rarely necessary, as Perl is included in the default installation of many popular operating systems.

Because of unusual changes required for the Mac OS Classic environment, a special port called MacPerl was shipped independently.[10]

The CPAN carries a complete list of supported platforms with links to the distributions available on each (http://www.cpan.org/ports/).


Windows


Users of Microsoft Windows typically install a native binary distribution of Perl,[11] most commonly ActivePerl. Compiling Perl from source code under Windows is possible, but most installations lack the requisite C compiler and build tools. This also makes it hard to install modules from the CPAN, particularly modules that are partially written in C.

Users of the ActivePerl binary distribution are therefore dependent on the repackaged modules provided in ActiveState's module repository, which are precompiled and can be installed with PPM. Limited resources to maintain this repository have caused various long-standing problems in the past.[12]

To address this and other problems of Perl on the Windows platform, win32.perl.org was launched by Adam Kennedy on behalf of The Perl Foundation in June 2006. This is a community website for "all things Windows and Perl." A major aim of this project is to provide a production-quality alternative binary distribution that includes a C compiler and build tools, so as to enable Windows users to install modules directly from the CPAN. This distribution is known as Strawberry Perl.

Another way of running Perl under Windows is provided by the Cygwin emulation layer. Cygwin provides a Unix-like environment on Windows that includes GCC, so compiling Perl from source is a more accessible option for users who take this approach.


LANGUAGE STRUCTURE

In Perl, the minimal Hello World program may be written as follows:

print "Hello, world!\n"

This prints the string ''Hello, world!'' and a newline, symbolically expressed by an n character whose interpretation is altered by the preceding backslash.

The canonical form of the program is slightly more verbose:


#!/usr/bin/perl
print "Hello, world!\n";


The hash mark character introduces a comment, which runs to the end of the line and is ignored by the compiler. The comment used here is of a special kind, called the shebang line. This tells Unix-like operating systems where to find the Perl interpreter, making it possible to invoke the program without explicitly mentioning perl. (Note that on Microsoft Windows systems, Perl programs are typically invoked by associating the .pl extension with the Perl interpreter. In order to deal with such circumstances, perl detects the shebang line and parses it for switches,[13] so it is not strictly true that the shebang line is ignored by the compiler.)

The second line in the canonical form includes a semicolon, which is used to separate statements in Perl. With only a single statement in a block or file, a separator is unnecessary, so it can be omitted from the minimal form of the program – or more generally from the final statement in any block or file. The canonical form includes it because it is common to terminate every statement even when it is unnecessary to do so, as this makes editing easier: code can be added to or moved away from the end of a block or file without having to adjust semicolons.


Data types


Perl has a number of fundamental data types, including scalars, arrays, hashes, filehandles and subroutines:
  • A scalar is a single value; it may be a number, a string or a reference.
  • An array is an ordered collection of scalars.
  • A hash, or associative array, is a map from strings to scalars; the strings are called ''keys'' and the scalars are called ''values''.
  • A file handle is a map to a file, device, or pipe which is open for reading, writing, or both.
  • A subroutine is a piece of code that may be passed arguments, be executed, and return data.


Most variables are marked by a leading sigil, which identifies the data type being accessed (not the type of the variable itself), except filehandles, which don't have a sigil. The same name may be used for variables of different data types, without conflict.


$foo # a scalar
@foo # an array
%foo # a hash
FOO # a file handle or constant
&foo # a subroutine. (The & is optional)


File handles and constants need not be uppercase, but it is a common convention owing to the fact that there is no sigil to denote them. Since 5.8 a scalar may be used as a file handle, and using this feature is encouraged in Damian Conway's ''Perl Best Practices''.
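
A file handle held in a scalar might be used as in the sketch below. To stay self-contained it reads from an in-memory string rather than a file on disk; the names $text and $fh are illustrative assumptions:

```perl
use strict;
use warnings;

my $text = "first line\nsecond line\n";

# A lexical scalar ($fh) holds the file handle. Opening a reference to a
# string reads from memory, so this sketch needs no file on disk.
open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";
my @lines = <$fh>;
close($fh);

chomp @lines;
print scalar(@lines), " lines read; the first is '$lines[0]'\n";
```

Because $fh is an ordinary lexical variable, it can be passed to subroutines and goes out of scope like any other scalar.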

Numbers are written in the bare form; strings are enclosed by quotes of various kinds.


$name = "joe";
$color = 'red';

$number1 = 42;
$number2 = "42";

# This returns true
if ($number1 == $number2) { print "Numbers and strings of numbers are the same!"; }

$answer = "The answer is $number1"; # Variable interpolation: The answer is 42
$price = 'This device costs $42'; # No interpolation in single quotes

$album = "David Bowie's \"Heroes\""; # literal quotes inside a string;
$album = 'David Bowie\'s "Heroes"'; # same as above with single quotes;
$album = q(David Bowie's "Heroes"); # the quote-like operators q() and qq() allow
# almost any delimiter instead of quotes, to
# avoid excessive backslashing

$multilined_string = <<EOF;
This is my multilined string
note that I am terminating it with the "EOF" word.
EOF


Perl will convert strings into numbers and vice versa depending on the context in which they are used. In the following example the strings $n and $m are treated as numbers when they are the arguments to the addition operator. This code prints the number '5', discarding non-numeric information for the operation, although the variable values remain the same. (The string concatenation operator is not +, but .)


$n = "3 apples";
$m = "2 oranges";
print $n + $m;


Perl also has a boolean context that it uses in evaluating conditional statements. The following values all evaluate as false in Perl:


$false = 0; # the number zero
$false = 0.0; # the number zero as a float
$false = 0b0; # the number zero in binary
$false = 0x0; # the number zero in hexadecimal
$false = '0'; # the string zero
$false = ""; # the empty string
$false = undef; # the return value from undef


All other values are evaluated to true. This includes the odd self-describing literal string of "0 but true", which in fact is 0 as a number, but true when used as a boolean. (Any non-numeric string would also have this property, but this particular string is ignored by Perl with respect to numeric warnings.) A less explicit but more conceptually portable version of this string is '0E0' or '0e0', which does not rely on characters being evaluated as 0, as '0E0' is literally "zero times ten to the zeroth power."

Evaluated boolean expressions also return scalar values. Although the documentation does not promise which ''particular'' true or false is returned (and thus cannot be relied on), many boolean operators return 1 for true and the empty-string for false (which evaluates to zero in a numeric context). The ''defined()'' function tells if the variable has any value set. In the above examples ''defined()'' is true for every value except ''undef''.
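
The difference between a false value and an undefined one, together with the behaviour of the "0 but true" string, can be sketched as follows (the variable names are invented for the example):

```perl
use strict;
use warnings;

my $zero = 0;
my $unset;                      # never assigned, so its value is undef
my $odd  = '0 but true';

my $zero_truth    = $zero ? 'true' : 'false';
my $zero_defined  = defined($zero)  ? 'defined' : 'undefined';
my $unset_defined = defined($unset) ? 'defined' : 'undefined';

my $odd_truth = $odd ? 'true' : 'false';
my $odd_sum   = $odd + 5;       # numerically the string counts as 0

print "0 is $zero_truth and $zero_defined\n";
print "an unassigned variable is $unset_defined\n";
print "'$odd' is $odd_truth, and adding 5 gives $odd_sum\n";
```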

If a specifically 1 or 0 result (as in C) is needed, an explicit conversion is thought by some authors to be required:


my $real_result = $boolean_result ? 1 : 0;


However, if it's known that the value is either 1 or ''undef'', an implicit conversion can be used instead:


my $real_result = $boolean_result + 0;


A list is written by listing its elements, separated by commas, and enclosed by parentheses where required by operator precedence.


@scores = (32, 45, 16, 5);


It can be written many other ways as well, some straightforward and some less so:


# An explicit and straightforward way
@scores = ('32', '45', '16', '5');

# Equivalent to the above, but the qw() quote-like operator saves typing of
# quotes and commas and reduces visual clutter; almost any delimiter can be
# used instead of parentheses
@scores = qw(32 45 16 5);

# The split function returns a list of strings, which are extracted
# from the expression using a regex template.
# This may be useful for reading from a file of comma-separated values (CSV)
@scores = split /,/, '32,45,16,5';

# "Idiomatic Perl": This notation uses the postfix for operator
# and aliasing of the magic variable $_ to the next value of the list
# during every iteration
push @scores, $_ for 32, 45, 16, 5;


A hash may be initialized from a list of key/value pairs:


%favorite = (
joe => 'red',
sam => 'blue'
);


The => operator is equivalent to a comma, except that it assumes quotes around the preceding token if it is a bare identifier: (joe => 'red') is the same as ('joe' => 'red'). It can therefore be used to elide quote marks, improving readability.

Individual elements of a list are accessed by providing a numerical index, in square brackets. Individual values in a hash are accessed by providing the corresponding key, in curly braces. The $ sigil identifies the accessed element as a scalar.


$scores[2] # an element of @scores
$favorite{joe} # a value in %favorite


Thus, a hash can also be specified by setting its keys individually:


$favorite{joe} = 'red';
$favorite{sam} = 'blue';


Multiple elements may be accessed by using the @ sigil instead (identifying the result as a list).


@scores[2, 3, 1] # three elements of @scores
@favorite{'joe', 'sam'} # two values in %favorite
@favorite{qw(joe sam)} # same as above
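
Put together as a runnable sketch, element access and slices behave as shown below. The keys 'ann' and 'bob' are made up to show that a hash slice can also be assigned to:

```perl
use strict;
use warnings;

my @scores   = (32, 45, 16, 5);
my %favorite = (joe => 'red', sam => 'blue');

my $third  = $scores[2];                 # a single element: $ sigil
my @picked = @scores[2, 3, 1];           # an array slice: @ sigil
my @colors = @favorite{qw(joe sam)};     # a hash slice: @ sigil, curly braces

# A hash slice can also appear on the left side of an assignment.
@favorite{qw(ann bob)} = ('green', 'black');

print "third: $third; picked: @picked; colors: @colors\n";
print join(', ', sort keys %favorite), "\n";
```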


The number of elements in an array can be obtained by evaluating the array in scalar context or with the help of the $# sigil. The latter gives the index of the last element in the array, not the number of elements.


$count = @friends; # Assigning to a scalar forces scalar context

# This notation is sometimes discouraged, because it tends
# to be confused with comments.

$#friends; # The index of the last element in @friends
$#friends+1; # Usually the number of elements in @friends is one more
# than $#friends because the first element is at index 0,
# not 1, unless the programmer reset this to a different
# value, which most Perl manuals discourage.
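
A short self-contained sketch of the two ways of measuring an array, using an invented @friends list:

```perl
use strict;
use warnings;

my @friends = qw(ann bob cat);

my $count      = @friends;          # scalar context yields the element count
my $last_index = $#friends;         # index of the last element
my $explicit   = scalar(@friends);  # scalar() forces scalar context explicitly

print "count: $count, last index: $last_index, explicit: $explicit\n";
```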


There are a few functions that operate on entire hashes.


@names = keys %addressbook;
@addresses = values %addressbook;

# Every call to each returns the next key/value pair.
# All values will be eventually returned, but their order
# cannot be predicted.
while (($name, $address) = each %addressbook) {
    print "$name lives at $address\n";
}

# Similar to the above, but sorted alphabetically
foreach my $next_name (sort keys %addressbook) {
    print "$next_name lives at $addressbook{$next_name}\n";
}



Control structures




Perl has several kinds of control structures.

It has block-oriented control structures, similar to those in the C, JavaScript, and Java programming languages. Conditions are surrounded by parentheses, and controlled blocks are surrounded by braces:

''label'' while ( ''cond'' ) { ... }
''label'' while ( ''cond'' ) { ... } continue { ... }
''label'' for ( ''init-expr'' ; ''cond-expr'' ; ''incr-expr'' ) { ... }
''label'' foreach ''var'' ( ''list'' ) { ... }
''label'' foreach ''var'' ( ''list'' ) { ... } continue { ... }
if ( ''cond'' ) { ... }
if ( ''cond'' ) { ... } else { ... }
if ( ''cond'' ) { ... } elsif ( ''cond'' ) { ... } else { ... }

Where only a single statement is being controlled, statement modifiers provide a more concise syntax:

''statement'' if ''cond'' ;
''statement'' unless ''cond'' ;
''statement'' while ''cond'' ;
''statement'' until ''cond'' ;
''statement'' foreach ''list'' ;
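
The block forms and the statement-modifier forms can be compared in a small sketch such as this one, where the data and the threshold of 30 are arbitrary choices made for the example:

```perl
use strict;
use warnings;

my @scores = (32, 45, 16, 5);
my ($total, $high_count) = (0, 0);

# Block form: the condition is in parentheses, the body in braces.
foreach my $score (@scores) {
    if ($score >= 30) {
        $high_count++;
    }
    $total += $score;
}

# Statement-modifier form: a single statement followed by its modifier.
my @labels;
push @labels, "low:$_" foreach grep { $_ < 30 } @scores;
push @labels, 'done' unless @labels == 0;

print "total $total, $high_count high, labels: @labels\n";
```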

Short-circuit logical operators are commonly used to affect control flow at the expression level:

''expr'' and ''expr''
''expr'' && ''expr''
''expr'' or ''expr''
''expr'' || ''expr''

(The and and or operators are similar to && and || but have lower precedence, which makes it easier to use them to control entire statements.)
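
A brief sketch of short-circuit operators steering control flow follows. The %config hash, the default port 8080 and the @log array are assumptions made for the example:

```perl
use strict;
use warnings;

my %config = (port => 0);

# || returns its left operand if that is true, otherwise its right operand.
my $port = $config{port} || 8080;        # 0 is false, so the default is used

# "or" and "and" evaluate their right side only when needed, so they can
# guard a whole statement.
my @log;
my $name = '';
$name or  push @log, 'name is empty';    # runs: the empty string is false
$port and push @log, "port is $port";    # runs: 8080 is true

print "$_\n" for @log;
```

Since || treats 0 as false, a legitimately configured value of 0 would also be replaced by the default here; whether that is desirable depends on the program.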