is called definition section, it introduces any initial C program code we want copied into the final program
The next section is the rules section. Each rule is made up of two parts: a pattern and an action, separated by whitespace. The lexer that lex generates will execute the action when it recognizes the pattern. These patterns are UNIX-style regular expresions, a slightly extended version of the same expressions used by tools such as grep, sed and ed.
The first rule in our example (ch1-02.l) is the following:
[\t ]+ /* ignore whitespace */;
The [] indicate that any one of the characters within the brackets matches the pattern. The + means that the pattern matches one or more consecutive copies of the subpattern that precedes the plus.
The second part of the rule, the action, is simply a semicolon, a do-nothing C statement. Its effect is to ignore the input.
The next set of rules use | (vertical bar) action. This is a special action that means to use the same action as the next pattern, so all of the verbs use the action specified for the last one.
The array yytext contains the text that matched the pattern. This action will print the recognized verb followed by the string ": is a verb\n".
The last two rules are:
The pattern [a-zA-Z]+ is a common one: it indicates any alphanumeric string with at least one character. The - character has special meaning when used inside square brackets: it denotes a range of characters beginning with the character to the left of the - and ending with the character to its right.
The two rules that make our lexer work are:
Lex patterns only match a given input character or string once.
Lex executes the action for the longest possible match for the current input. Thus, lex would see island as matching our all-inclusive rule because that was a longer match than is.
The last line is the default case. The special character . (perioed) matches any single character other than a newline, and \n matches a newline character. The special action ECHO prints the matched pattern on the output, copying any punctuation or other characters.
The final section is the user subroutines section, which can consist of any legal C code. Lex copies it to the C file after the end of the lex generated code.
Grammars
To recognize specific sequences of tokens and perform appropriate actions, traditionally a description of such a set of actions is known as a grammar.
\.\n { state = LOOKUP;
return 0;
}
The backslash in front of the period quotes the period, so this rule matches a period followed by a newline. The other change we made to our lexical analyzer was to omit the main() routine as it will now be provided within the parser.
A Yacc Parser
The Rules Section
The rules section describes the actual grammar as a set of production rules or simply rules. Each rule consists of a single name on the left-hand side of the : operator, a list of symbols and action code on the right-hand side, and a semicolon indicating the end of the rule. By default, the first rule is the highest-level rule. That is, the parser attempts to find a list of tokens which match this initial rule, or more commonly, rules found from the initial rule. The expression on the right-hand side of the rule is a list of zero or more names. A typical simple rule has a single symbol on the right-hand side as in the object rule which is defined to be a NOUN. The symbol on the left-hand side of the rule can then be used like a token in other rules. From this, we build complex grammars.
sentence: subject VERB object { printf("Sentence is valid\n"); }
The action part of the block consists of a C block, beginning with { and ending with }. The parser executes an action at the end of a rule as soon as the rule matches. In our sentence rule, the action reports that we’ve successfully parsed a sentence. Since sentence is the top level symbol, the entire input must match a sentence. The parser returns to its caller, in this case the main program, when the lexer reports the end of the input. Subsequent calls to yyparse() reset the state and begin processing again.
When the parser sees invalid tokens, it calls yyerror(), which we provide in the user subroutines section, and then recognizes the special rule error. You can provide error recovery code that tries to get the parser back into a state where it can continue parsing. If error recovery fails or, as is the case here, there is no error recovery mode, yyparse() returns to the caller after it finds an error.
We introduced recursion into this grammar. Recursion, in which a rule refers directly or indirectly to itself, is a powerful tool for describing grammars, and we use the technique in nearly every yacc grammar we write.
Running Lex and Yacc
$ lex ch1-n.l
$ yacc -d ch1-m.y
$ cc -c lex.yy.c y.tab.c
$ cc -o example-m.n lex.yy.o y.tab.o -ll
The first line runs lex over the lex specification and generates a file, lex.yy.c, which contains C code for the lexer. In the second line, we use yacc to generate both y.tab.c and y.tab.h (the latter is the file of token definitions created by the -d switch).
Lanyon is an unassuming Jekyll theme that places content first by tucking away navigation in a hidden drawer. It’s based on Poole, the Jekyll butler.
Built on Poole
Poole is the Jekyll Butler, serving as an upstanding and effective foundation for Jekyll themes by @mdo. Poole, and every theme built on it (like Lanyon here) includes the following:
Lanyon is by preference a forward-thinking project. In addition to the latest versions of Chrome, Safari (mobile and desktop), and Firefox, it is only compatible with Internet Explorer 9 and above.
Download
Lanyon is developed on and hosted with GitHub. Head to the GitHub repository for downloads, bug reports, and features requests.
Howdy! This is an example blog post that shows several types of HTML content supported in this theme.
Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Aenean eu leo quam. Pellentesque ornare sem lacinia quam venenatis vestibulum. Sed posuere consectetur est at lobortis. Cras mattis consectetur purus sit amet fermentum.
Curabitur blandit tempus porttitor. Nullam quis risus eget urna mollis ornare vel eu leo. Nullam id dolor id nibh ultricies vehicula ut id elit.
Etiam porta sem malesuada magna mollis euismod. Cras mattis consectetur purus sit amet fermentum. Aenean lacinia bibendum nulla sed consectetur.
Inline HTML elements
HTML defines a long list of available inline tags, a complete list of which can be found on the Mozilla Developer Network.
To bold text, use <strong>.
To italicize text, use <em>.
Abbreviations, like HTML should use <abbr>, with an optional title attribute for the full phrase.
Citations, like — Mark otto, should use <cite>.
Deleted text should use <del> and inserted text should use <ins>.
Superscript text uses <sup> and subscript text uses <sub>.
Most of these elements are styled by browsers with few modifications on our part.
Heading
Vivamus sagittis lacus vel augue rutrum faucibus dolor auctor. Duis mollis, est non commodo luctus, nisi erat porttitor ligula, eget lacinia odio sem nec elit. Morbi leo risus, porta ac consectetur ac, vestibulum at eros.
Code
Cum sociis natoque penatibus et magnis dis code element montes, nascetur ridiculus mus.
Aenean lacinia bibendum nulla sed consectetur. Etiam porta sem malesuada magna mollis euismod. Fusce dapibus, tellus ac cursus commodo, tortor mauris condimentum nibh, ut fermentum massa.
Lists
Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Aenean lacinia bibendum nulla sed consectetur. Etiam porta sem malesuada magna mollis euismod. Fusce dapibus, tellus ac cursus commodo, tortor mauris condimentum nibh, ut fermentum massa justo sit amet risus.
Praesent commodo cursus magna, vel scelerisque nisl consectetur et.
Donec id elit non mi porta gravida at eget metus.
Nulla vitae elit libero, a pharetra augue.
Donec ullamcorper nulla non metus auctor fringilla. Nulla vitae elit libero, a pharetra augue.
Vestibulum id ligula porta felis euismod semper.
Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus.
Maecenas sed diam eget risus varius blandit sit amet non magna.
Cras mattis consectetur purus sit amet fermentum. Sed posuere consectetur est at lobortis.
HyperText Markup Language (HTML)
The language used to describe and define the content of a Web page
Cascading Style Sheets (CSS)
Used to describe the appearance of Web content
JavaScript (JS)
The programming language used to build advanced Web sites and applications
Integer posuere erat a ante venenatis dapibus posuere velit aliquet. Morbi leo risus, porta ac consectetur ac, vestibulum at eros. Nullam quis risus eget urna mollis ornare vel eu leo.
Tables
Aenean lacinia bibendum nulla sed consectetur. Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Name
Upvotes
Downvotes
Totals
21
23
Alice
10
11
Bob
4
3
Charlie
7
9
Nullam id dolor id nibh ultricies vehicula ut id elit. Sed posuere consectetur est at lobortis. Nullam quis risus eget urna mollis ornare vel eu leo.
Jekyll is a static site generator, an open-source tool for creating simple yet powerful websites of all shapes and sizes. From the project’s readme:
Jekyll is a simple, blog aware, static site generator. It takes a template directory […] and spits out a complete, static website suitable for serving with Apache or your favorite web server. This is also the engine behind GitHub Pages, which you can use to host your project’s page or blog right here from GitHub.
It’s an immensely useful tool and one we encourage you to use here with Lanyon.