
Analysis of the working principle of Javac

While talking with Java developers, I have found that some of them have trouble telling apart JDK, J2EE, J2ME, JavaEE, and JVM. To make sense of these terms, it helps to start with Javac.

Javac is a compiler. A compiler translates one language specification into another; usually its job is to turn a language that is easy for people to understand into a language that is easy for machines to understand. Javac first converts the Java language into a form the JVM can recognize, and the JVM then converts that into the machine language of the machine it is running on. In other words, Javac turns Java source files into class files.

Before looking at the basic structure and working modules of the Javac compiler, I would like to recall some compiler theory, namely the steps a compiler goes through. I will try to summarize that thick body of theory as concisely as I can.
First, read the source code byte by byte and pick out the words that the grammar defines, such as if, else, and for, deciding which are legal and which are not. This step is lexical analysis. Its result is a stream of well-formed Tokens extracted from the raw source.
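To make the lexical analysis step concrete, here is a minimal sketch of a hand-written tokenizer. It is not Javac's scanner: the class name TinyLexer, the Kind enum, and the small keyword set are all made up for illustration, and it only handles identifiers, integer literals, and single-character symbols.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TinyLexer {
    // Hypothetical token kinds; javac's real Token set is much larger.
    enum Kind { KEYWORD, IDENTIFIER, NUMBER, SYMBOL }

    static final class Token {
        final Kind kind;
        final String text;
        Token(Kind kind, String text) { this.kind = kind; this.text = text; }
        @Override public String toString() { return kind + "(" + text + ")"; }
    }

    // A deliberately tiny keyword set, assumed for this example.
    private static final Set<String> KEYWORDS =
            new HashSet<>(Arrays.asList("if", "else", "for", "while", "return"));

    // Walk the source one character at a time and group characters into tokens.
    static List<Token> tokenize(String src) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            int start = i;
            if (Character.isJavaIdentifierStart(c)) {
                while (i < src.length() && Character.isJavaIdentifierPart(src.charAt(i))) i++;
                String word = src.substring(start, i);
                tokens.add(new Token(KEYWORDS.contains(word) ? Kind.KEYWORD : Kind.IDENTIFIER, word));
            } else if (Character.isDigit(c)) {
                while (i < src.length() && Character.isDigit(src.charAt(i))) i++;
                tokens.add(new Token(Kind.NUMBER, src.substring(start, i)));
            } else {
                i++; // any other character becomes a one-character symbol token
                tokens.add(new Token(Kind.SYMBOL, String.valueOf(c)));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (Token t : tokenize("if (x > 10) y = 1; else y = 2;")) {
            System.out.println(t);
        }
    }
}
```

The point is only that a flat run of characters becomes a flat run of classified words; nothing here checks yet whether the words appear in a legal order.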
The second step is syntax analysis of the Token stream. It checks whether the words, put together, conform to the Java language specification, for example whether an if is followed by a Boolean expression. The result of syntax analysis is an abstract syntax tree that conforms to the Java language specification. The syntax tree reorganizes the words according to the grammar rules; it is a structured representation of the language's vocabulary.
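As a rough illustration of how syntax analysis turns flat input into a tree, the sketch below parses arithmetic expressions with a small recursive descent parser. The grammar, the class TinyParser, and its node types are assumptions chosen to keep the example short; Javac's parser handles the full Java grammar and is far more involved.

```java
public class TinyParser {
    // Tree nodes: a number leaf or a binary operation with two children.
    static abstract class Node {}

    static final class Num extends Node {
        final int value;
        Num(int value) { this.value = value; }
        @Override public String toString() { return Integer.toString(value); }
    }

    static final class BinOp extends Node {
        final char op;
        final Node left, right;
        BinOp(char op, Node left, Node right) { this.op = op; this.left = left; this.right = right; }
        @Override public String toString() { return "(" + left + " " + op + " " + right + ")"; }
    }

    private final String src;
    private int pos;

    private TinyParser(String src) { this.src = src.replace(" ", ""); }

    // Parse the whole input or fail with IllegalArgumentException.
    static Node parse(String src) {
        TinyParser p = new TinyParser(src);
        Node tree = p.expr();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("unexpected input at position " + p.pos);
        }
        return tree;
    }

    private boolean at(char c) { return pos < src.length() && src.charAt(pos) == c; }

    // expr := term (('+' | '-') term)*
    private Node expr() {
        Node left = term();
        while (at('+') || at('-')) {
            char op = src.charAt(pos++);
            left = new BinOp(op, left, term());
        }
        return left;
    }

    // term := factor (('*' | '/') factor)*
    private Node term() {
        Node left = factor();
        while (at('*') || at('/')) {
            char op = src.charAt(pos++);
            left = new BinOp(op, left, factor());
        }
        return left;
    }

    // factor := NUMBER | '(' expr ')'
    private Node factor() {
        if (at('(')) {
            pos++;
            Node inner = expr();
            if (!at(')')) throw new IllegalArgumentException("')' expected at position " + pos);
            pos++;
            return inner;
        }
        int start = pos;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
        if (start == pos) throw new IllegalArgumentException("number expected at position " + pos);
        return new Num(Integer.parseInt(src.substring(start, pos)));
    }

    public static void main(String[] args) {
        System.out.println(parse("1 + 2 * 3"));   // prints (1 + (2 * 3))
        System.out.println(parse("(1 + 2) * 3")); // prints ((1 + 2) * 3)
    }
}
```

The nesting of the printed parentheses mirrors the shape of the tree. Input that does not fit the grammar, such as "1 +", is rejected, which is the same kind of check as requiring an expression after if.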
The third step is semantic analysis. If syntax analysis succeeds, the compiler turns the more complex syntax into simpler syntax. For example, some convenience constructs in Java are converted into basic structures built from keywords such as if, else, and for. The result is an annotated abstract syntax tree that is one step closer to the grammar of the target language.
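One familiar case of this simplification is the enhanced for loop. The sketch below writes the same method twice: once with the convenient syntax, and once in roughly the form the compiler reduces it to, a plain for loop over an Iterator with explicit unboxing. The second version is a hand-written approximation, not the exact tree Javac produces, and the class name DesugarDemo is made up.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class DesugarDemo {
    // The form a developer writes: enhanced for loop with auto-unboxing.
    static int sumSugared(List<Integer> xs) {
        int sum = 0;
        for (int x : xs) {
            sum += x;
        }
        return sum;
    }

    // Approximately the simpler form it is reduced to:
    // a basic for loop, an explicit Iterator, and explicit unboxing.
    static int sumDesugared(List<Integer> xs) {
        int sum = 0;
        for (Iterator<Integer> it = xs.iterator(); it.hasNext(); ) {
            int x = it.next().intValue();
            sum += x;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> xs = Arrays.asList(1, 2, 3);
        System.out.println(sumSugared(xs));   // prints 6
        System.out.println(sumDesugared(xs)); // prints 6
    }
}
```

Both methods behave the same; the second one simply uses fewer language features, which makes the later bytecode generation step easier.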
Finally, the bytecode generator turns the annotated abstract syntax tree into bytecode that conforms to the JVM specification.
The modules of Javac exist to carry out these tasks. There are four of them: the lexical analyzer, the syntax analyzer, the semantic analyzer, and the target code generator.
Lexical analyzer: this work is driven from the parseCompilationUnit method of JavacParser; the source code can be downloaded from OpenJDK. Starting from the first character of the source file, it checks the input character by character and, following the Java syntax specification, finds the package and import declarations, the class definition, and the field and method definitions. Every keyword and every other item it finds is represented by the Token class, so the Java source file is turned into a corresponding stream of Tokens, from which the abstract syntax tree is then built.
Parser: builds a more structured syntax tree from the Tokens produced by the lexical analyzer, much as words are assembled into phrases and then into complete sentences. Concretely, it analyzes the Token stream according to the grammar rules and produces tree nodes; each node of the syntax tree is an instance of com.sun.tools.javac.tree.JCTree.
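The JCTree classes are internal, but the JDK also exposes the parsed tree through the public Compiler Tree API (the com.sun.source packages). The sketch below uses that public API rather than JCTree directly. It assumes it is run on a full JDK, where ToolProvider.getSystemJavaCompiler() returns a compiler; the helper names TreeDump, StringSource, and describe are made up for this example.

```java
import com.sun.source.tree.ClassTree;
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.MethodTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;
import java.io.IOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

public class TreeDump {
    // Wraps a source string so the compiler can read it like a file.
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;
        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }
        @Override public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    // Parses the source (no semantic analysis) and lists the classes and methods found.
    static List<String> describe(String className, String code) throws IOException {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("no system Java compiler; a JDK is required");
        }
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, null, null,
                Collections.singletonList(new StringSource(className, code)));
        final List<String> found = new ArrayList<>();
        for (CompilationUnitTree unit : task.parse()) {
            new TreeScanner<Void, Void>() {
                @Override public Void visitClass(ClassTree node, Void p) {
                    found.add("class " + node.getSimpleName());
                    return super.visitClass(node, p);
                }
                @Override public Void visitMethod(MethodTree node, Void p) {
                    found.add("method " + node.getName());
                    return super.visitMethod(node, p);
                }
            }.scan(unit, null);
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(describe("A", "class A { void f() {} int g() { return 1; } }"));
    }
}
```

Because only parse() is called, the tree reflects the source text as written; additions made during semantic analysis are not part of it yet.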
Semantic analyzer: processes the syntax tree further so that Java bytecode can be generated from it. For example, it adds a default constructor to a class, checks that variables are initialized before use, folds constants, checks the types of variables and operations, checks that every statement is reachable, and checks that checked exceptions are either caught or declared to be thrown. The result is the completed syntax tree.
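Constant folding is one of these steps whose effect can be observed from ordinary code. In the sketch below, a concatenation of two string literals is a compile-time constant, so it ends up as the same interned String as the literal "javac", while a concatenation involving a non-constant value is built at run time and is a different object. The class name ConstantFoldingDemo is made up, and the == comparisons are used on purpose to compare object identity.

```java
public class ConstantFoldingDemo {
    // A constant expression: the compiler can compute 86400 ahead of time.
    static final int SECONDS_PER_DAY = 24 * 60 * 60;

    // "ja" + "vac" is a constant expression, so it is folded into one
    // literal and refers to the same interned String as "javac".
    static boolean foldedStringIsInterned() {
        String folded = "ja" + "vac";
        return folded == "javac";
    }

    // Here one operand is not a compile-time constant, so the concatenation
    // happens at run time and yields a newly created String object.
    static boolean runtimeConcatIsInterned() {
        String part = new String("ja");
        String joined = part + "vac";
        return joined == "javac";
    }

    public static void main(String[] args) {
        System.out.println(SECONDS_PER_DAY);           // prints 86400
        System.out.println(foldedStringIsInterned());  // prints true
        System.out.println(runtimeConcatIsInterned()); // prints false
    }
}
```

In normal code strings should of course be compared with equals; identity comparison is used here only to make the folding visible.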
Code generator: traverses the final syntax tree and generates Java bytecode; this step is done by the com.sun.tools.javac.jvm.Gen class. Generating bytecode takes two steps. First, the code blocks in Java methods are converted into instruction sequences that conform to the JVM. The JVM is stack-based, so every operation is carried out by pushing values onto the stack and popping them off. Then the bytecode is written, in the JVM class file format, to a file with the .class extension.
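To show what "stack-based" means in practice, here is a toy stack machine that evaluates 1 + 2 * 3. The instruction names PUSH, ADD, and MUL are invented for this sketch and only loosely modeled on JVM instructions such as iconst, iadd, and imul; the real JVM instruction set and the Gen class are much richer.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class TinyStackMachine {
    // Hypothetical instruction set, not real JVM opcodes.
    enum Op { PUSH, ADD, MUL }

    static final class Insn {
        final Op op;
        final int operand; // only meaningful for PUSH
        private Insn(Op op, int operand) { this.op = op; this.operand = operand; }
        static Insn push(int value) { return new Insn(Op.PUSH, value); }
        static Insn of(Op op) { return new Insn(op, 0); }
    }

    // Executes the program; every operation works by pushing and popping.
    static int run(Insn... program) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (Insn insn : program) {
            switch (insn.op) {
                case PUSH:
                    stack.push(insn.operand);
                    break;
                case ADD: {
                    int right = stack.pop();
                    int left = stack.pop();
                    stack.push(left + right);
                    break;
                }
                case MUL: {
                    int right = stack.pop();
                    int left = stack.pop();
                    stack.push(left * right);
                    break;
                }
            }
        }
        if (stack.size() != 1) {
            throw new IllegalStateException("program must leave exactly one value on the stack");
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        // 1 + 2 * 3 in postfix order: push 1, push 2, push 3, multiply, add.
        int result = run(Insn.push(1), Insn.push(2), Insn.push(3),
                         Insn.of(Op.MUL), Insn.of(Op.ADD));
        System.out.println(result); // prints 7
    }
}
```

A code generator walking the tree for 1 + 2 * 3 would emit the operands before the operator at every node, which is exactly the postfix order the machine consumes. The real instructions for a compiled class can be inspected with the JDK's javap -c tool.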