Friday, January 21, 2011

Java tutorials


Click the links below to download.

The Java SE Tutorials primarily describe features in Java SE 6. For best results, download JDK 6.

What's New

As part of the Oracle Author Podcast Series, a podcast on the Java Tutorials has just been released. You can hear the 10-minute podcast through the Oracle Author Podcasts website.
Enjoy quizzes? Take a minute to answer this quiz about Java applets. Java Applets Quiz
The Java Tutorials are continuously updated to keep up with changes to the Java Platform and to incorporate feedback from our readers. Included in the most recent release:
  • For several months the Java SE Tutorials were not available as a download. We are happy to report that the very popular bundle is back and available through the link under the "Tutorial Resources" box to the right.
  • Do the Java Tutorials seem overwhelming to you? We have added a new Learning Paths page. Please let us know what you think!
  • The new Fork/Join page, part of the Concurrency lesson, describes how you can use the Fork/Join framework to take advantage of multiple processors. This feature is available now in the Java SE 7 release available on java.net.
  • The JDBC Basics lesson has been completely revamped, including updated sample code that you can download, compile, and run – the code has been configured for Java DB and MySQL. See Getting Started for more information.
  • Finally, a small, but notable, change is that the standalone JNDI tutorial, previously available on java.sun.com, has been moved to the Java SE documentation archive on download.oracle.com and many broken links in the tutorial have been fixed. The front page of the JNDI tutorial has been modified accordingly, and redirects are in place ensuring a seamless transition.

Trails Covering the Basics

These trails are available in book form as The Java Tutorial, Fourth Edition. To buy this book, refer to the box to the right.
  • Getting Started — An introduction to Java technology and lessons on installing Java development software and using it to create a simple program.
  • Learning the Java Language — Lessons describing the essential concepts and features of the Java Programming Language.
  • Essential Java Classes — Lessons on exceptions, basic input/output, concurrency, regular expressions, and the platform environment.
  • Collections — Lessons on using and extending the Java Collections Framework.
  • Swing — An introduction to the Swing GUI toolkit, with an overview of features and a visual catalog of components. See below for a more comprehensive tutorial on Swing.
  • Deployment — How to package applications and applets using JAR files, and deploy them using Java Web Start and Java Plug-in.
  • Preparation for Java Programming Language Certification — List of available training and tutorial resources.

Creating Graphical User Interfaces

This trail is available in book form as The JFC Swing Tutorial. To buy this book, refer to the box to the right.

Specialized Trails and Lessons

These trails and lessons are only available as web pages.

  • Custom Networking — An introduction to the Java platform's powerful networking features.
  • The Extension Mechanism — How to make custom APIs available to all applications running on the Java platform.
  • Full-Screen Exclusive Mode API — How to write applications that more fully utilize the user's graphics hardware.
  • Generics — An enhancement to the type system that supports operations on objects of various types while providing compile-time type safety. Note that this lesson is for advanced users. The Java Language trail contains a Generics lesson that is suitable for beginners.
  • Internationalization — An introduction to designing software so that it can easily be adapted (localized) to various languages and regions.
  • JavaBeans — The Java platform's component technology.
  • JDBC Database Access — Introduces an API for connectivity between Java applications and a wide range of databases and other data sources.
  • JMX — Java Management Extensions provides a standard way of managing resources such as applications, devices, and services.
  • JNDI — The Java Naming and Directory Interface enables access to naming and directory services such as DNS and LDAP.
  • JAXP — Introduces the Java API for XML Processing (JAXP) 1.4 technology.
  • RMI — The Remote Method Invocation API allows an object to invoke methods of an object running on another Java Virtual Machine.
  • Reflection — An API that represents ("reflects") the classes, interfaces, and objects in the current Java Virtual Machine.
  • Security — Java platform features that help protect applications from malicious software.
  • Sound — An API for playing sound data from applications.
  • 2D Graphics — How to display and print 2D graphics in applications.
  • Sockets Direct Protocol — How to enable the Sockets Direct Protocol to take advantage of InfiniBand.

Friday, January 14, 2011

Computers and Programming

Hardware and Software

A computer is a device capable of performing computations and making logical decisions at speeds millions and even billions of times faster than human beings can. For example, many of today’s personal computers can perform hundreds of millions of additions per second.
Computers process data under the control of sets of instructions called computer programs. These computer programs guide the computer through orderly sets of actions specified by people called computer programmers.
A computer is composed of various devices (such as the keyboard, screen, mouse, disks, memory, CD-ROM, and processing units) that are referred to as hardware. The computer programs that run on a computer are referred to as software.

Computer Hardware

Almost every computer may be seen as being divided into six logical units. Figure 1 illustrates the main computer components.

Input Unit

This unit obtains information from various input devices and places it at the disposal of the other units so that it can be processed. Today, information is entered into computers mainly through keyboards and mouse devices.

Output Unit

This unit takes information that has been processed by the computer and places it on various output devices, making the information available for use outside the computer. Most computer output today is displayed on screens, printed on paper, or used to control other devices.

Memory Unit

The memory unit stores information. Each computer contains memory of two main types: RAM and ROM.
RAM (random access memory) is volatile. Your program and data are stored in RAM when you are using the computer.
Figure 1: Basic hardware units of a computer
ROM (read only memory) contains fundamental instructions that cannot be lost or changed by the user. ROM is non-volatile.

Arithmetic and Logic Unit (ALU)

The ALU performs all the arithmetic and logic operations, e.g., addition, subtraction, and comparison.

Central Processing Unit (CPU)

This unit supervises the overall operation of the computer. The CPU tells the input unit when information should be read into the memory unit, tells the ALU when information from memory should be used in calculations, and tells the output unit when to send information from the memory unit to certain output devices.

Secondary Storage

Secondary storage devices serve as a permanent storage area for programs and data.
Virtually all secondary storage is now done on magnetic tapes, magnetic disks and CD-ROMs.
A magnetic hard disk consists of either a single rigid platter or several platters that spin together on a common spindle. A movable access arm positions the read and write mechanisms over, but not quite touching, the recordable surfaces. Such a configuration is shown in Figure 2.
Figure 2: The internal structure of a magnetic hard disk drive

Computer Software

A computer program is a set of instructions used to operate a computer to produce a specific result.
Another term for a program or a set of programs is software, and we use both terms interchangeably throughout the text.
Writing computer programs is called computer programming.
The languages used to create computer programs are called programming languages.
To understand C++ programming, it is helpful to know a little background about how current programming languages evolved.

Machine and Assembly Languages

Machine languages are the lowest level of computer languages. Programs written in machine language consist entirely of 1s and 0s.
Programs in machine language can directly control the computer’s hardware.
00101010 000000000001 000000000010
10011001 000000000010 000000000011
A machine language instruction consists of two parts: an instruction part and an address part.
The instruction part (opcode) is the leftmost group of bits in the instruction and tells the computer the operation to be performed.
The address part specifies the memory address of the data to be used in the instruction.
Assembly languages perform the same tasks as machine languages, but use symbolic names for opcodes and operands instead of 1s and 0s.
LOAD BASEPAY
ADD OVERPAY
STORE GROSSPAY
Since computers can only execute machine language programs, an assembly language program must be translated into a machine language program before it can be executed on a computer.
Figure 3: Assembly translation
Machine languages and assembly languages are called low-level languages since they are closest to computer hardware.

High-level Programming Languages

High-level programming languages create computer programs using instructions that are much easier to understand than machine or assembly language instructions.
Programs written in a high-level language must be translated into a low level language using a program called a compiler.
A compiler translates programming code into a low-level format.
High-level languages allow programmers to write instructions that look like everyday English sentences and commonly used mathematical notation.
Each line in a high-level language program is called a statement.
Ex: Result = (First + Second)*Third.
Once a program is written in a high-level language, it must also be translated into the machine language of the computer on which it will be run. This translation can be accomplished in two ways.
When each statement in a high-level source program is translated individually and executed immediately upon translation, the programming language used is called an interpreted language, and the program doing the translation is called an interpreter.
When all of the statements in a high-level source program are translated as a complete unit before any one statement is executed, the programming language used is called a compiled language. In this case, the program doing the translation is called a compiler.

Application and System Software

There are two types of computer programs: application software and system software.
Application software consists of those programs written to perform particular tasks required by the users.
System software is the collection of programs that must be available to any computer system for it to operate.
The most important system software is the operating system. Examples of well-known operating systems include MS-DOS, UNIX, and MS WINDOWS. Many operating systems allow users to run multiple programs; such operating systems are called multitasking systems.
Besides operating systems, language translators are also system software.

High-Level Programming Languages

Because of the difficulty of working with low-level languages, high-level languages were developed to make it easier to write computer programs. High level programming languages create computer programs using instructions that are much easier to understand than machine or assembly language code because you can use words that more clearly describe the task being performed. Examples of high-level languages include FORTRAN, COBOL, BASIC, PASCAL, C, C++ and JAVA.
C and C++ are two separate, but related, programming languages. In the 1970s, at Bell Laboratories, Dennis Ritchie and Brian Kernighan designed the C programming language. In 1985, at Bell Laboratories, Bjarne Stroustrup created C++ based on the C language. C++ is an extension of C that adds object-oriented programming capabilities.

What is Syntax?

A programming language’s syntax is the set of rules for writing grammatically correct language statements. In practice this means a C statement with correct syntax has a proper form specified for the compiler. As such, the compiler accepts the statement and does not generate an error message.

The C Programming Language

Initially, C was used exclusively on UNIX systems and minicomputers. During the 1980s, C compilers were written for other platforms, including PCs.
To provide a level of standardization for the C language, in 1989 ANSI created a standard version of C called ANSI C.
One main benefit of the C language is that it is much closer to assembly language than other high-level programming languages are.
Programs written in C often run much faster and more efficiently than programs written in other high-level programming languages.

The C++ Programming Language

C++ is an extension of C that adds object-oriented programming capabilities. C++ is a popular programming language for writing graphical programs that run on Windows and Macintosh.
The standardized version of C++ is commonly referred to as ANSI C++.
The ANSI C and ANSI C++ standards define how C/C++ code can be written.
The ANSI standards also define run-time libraries, which contain useful functions, variables, constants, and other programming items that you can add to your programs.
The ANSI C++ run-time library is also called the Standard Template Library or Standard C++ Library.

Structured Programming and Object Oriented Programming

During the 1960s, many large software development efforts encountered severe difficulties. Software schedules were typically late, costs greatly exceeded budgets, and finished products were unreliable. People began to realize that software development was a far more complex activity than they had imagined. Research activity in the 1960s resulted in the evolution of structured programming – a disciplined approach to writing programs that are clearer than unstructured programs, easier to test and debug, and easier to modify. Chapter 5 discusses the principles of structured programming. Chapters 2 through 6 develop many structured programs.
One of the more tangible results of this research was the development of the Pascal programming language by Niklaus Wirth in 1971. Pascal was designed for teaching structured programming in academic environments and rapidly became the preferred programming language in most universities.
In the 1980s, a revolution was brewing in the software community: object-oriented programming. Objects are essentially reusable software components that model items in the real world. Software developers discovered that a modular, object-oriented design and implementation approach can make software development groups much more productive than previously popular programming techniques such as structured programming.
Object-oriented programming refers to the creation of reusable software objects that can be easily incorporated into another program. An object is programming code and data that can be treated as an individual unit or component. Data refers to information contained within variables, constants, or other types of storage structures. The procedures associated with an object are referred to as functions or methods, and the variables associated with an object are referred to as properties or attributes. Object-oriented programming allows programmers to use programming objects that they have written themselves or that have been written by others.

Problem Solution and Software Development

No matter what field of work you choose, you will have to solve problems. Many of these can be solved quickly and easily; others require considerable planning and forethought if the solution is to be appropriate and efficient.
Creating a program is no different because a program is a solution developed to solve a particular problem. As such, writing a program is almost the last step in a process of first determining what the problem is and the method that will be used to solve the problem.
One technique used by professional software developers for understanding the problem being solved and for creating an effective and appropriate software solution is called the software development procedure. The procedure consists of three overlapping phases:
- Development and Design
- Documentation
- Maintenance
As a discipline, software engineering is concerned with creating readable, efficient, reliable, and maintainable programs and systems.

Phase I: Development and Design

The first phase consists of four steps:

1. Analyze the problem

This step is required to ensure that the problem is clearly defined and understood. The person doing the analysis has to analyze the problem requirements in order to understand what the program must do, what outputs are required, and what inputs are needed. Understanding the problem is very important: do not start solving it until you clearly understand it.

2. Develop a Solution

Programming is all about solving problems. In this step, you have to develop an algorithm to solve the given problem. An algorithm is a sequence of steps that describes how the data are to be processed to produce the desired outputs.
An algorithm should be (at least):
  • complete (i.e., it covers all the parts)
  • unambiguous (no doubt about what it does)
  • finite (it should finish)

3. Code the solution

This step consists of translating the algorithm into a computer program using a programming language.

4. Test and correct the program

This step requires testing of the completed computer program to ensure that it does, in fact, provide a solution to the problem. Any errors that are found during the tests must be corrected.
Figure below lists the relative amount of effort that is typically expended on each of these four development and design steps in large commercial programming projects.
Figure 4: Four development and design steps in commercial programming projects

Phase II: Documentation

Documentation requires collecting critical documents during the analysis, design, coding, and testing.
There are five documents for every program solution:
  • Program description
  • Algorithm development and changes
  • Well-commented program listing
  • Sample test runs
  • User’s manual

Phase III: Maintenance

This phase is concerned with the ongoing correction of problems, revisions to meet changing needs and the addition of new features. Maintenance is often the major effort, and the longest lasting of the three phases. While development may take days or months, maintenance may continue for years or decades.

Algorithms

An algorithm is defined as a step-by-step sequence of instructions that describes how the data are to be processed to produce the desired outputs. In essence, an algorithm answers the question: “What method will you use to solve the problem?”
You can describe an algorithm by using flowchart symbols. In this way, you obtain a flowchart, which is an outline of the basic structure or logic of the program.

Flowchart Symbols

To draw a flowchart, we employ the symbols shown in the figure below.
Figure 5: Flowchart symbols
The meaning of each flowchart symbol is given as follows.
Figure 6: Description of flowchart symbols
To illustrate an algorithm, we consider the simple program that computes the pay of a person. The flowchart for this program is given in the Figure below.
Note: Name, Hours and Pay are variables in the program.
Figure 7: A sample flowchart

Algorithms in pseudo-code

You can also use English-like phrases to describe an algorithm. In this case, the description is called pseudocode. Pseudocode is an artificial and informal language that helps programmers develop algorithms. Pseudocode has ways to represent sequence, decision, and repetition in algorithms. Carefully prepared pseudocode can be converted easily to a corresponding C++ program.
Example: The following set of instructions forms a detailed algorithm in pseudocode for calculating the pay of a person.
Input the three values into the variables Name, Hours, Rate.
Calculate Pay = Hours * Rate.
Display Name and Pay.

Loops in Algorithms

Many problems require repetition capability, in which the same calculation or sequence of instructions is repeated, over and over, using different sets of data.
Example 1.1. Write a program to do the following task: print a list of the numbers from 4 to 9 and, next to each number, print the square of that number.
The flowchart for the algorithm that solves this problem is given in Figure below. You will see in this figure the flowchart symbol for decision and the flowline that can connect backward to represent a loop.
Figure 8: Flowcharts of example 1.1
Note:
  1. In the flowchart, the statement
NUM = NUM + 1
means “old value of NUM + 1 becomes new value of NUM ”.
The above algorithm can be described in pseudocode as follows:
NUM = 4
do
SQNUM = NUM*NUM
Print NUM, SQNUM
NUM = NUM + 1
while (NUM <= 9)
You can compare the pseudo-code and the flowchart in Figure above to understand the meaning of the do… while construct used in the pseudo-code.

Flowchart versus pseudocode

Since flowcharts are inconvenient to revise, they have fallen out of favor with programmers. Nowadays, the use of pseudocode has gained increasing acceptance.
Only after an algorithm has been selected and the programmer understands the steps required can the algorithm be written using computer-language statements. The writing of an algorithm using computer-language statements is called coding the algorithm, which is the third step in our program development process.

Parallel Programming with MapReduce

MapReduce Overview

MapReduce is a framework designed by Google [1]. It is loosely based on the Map and Reduce programming constructs of functional languages like Lisp [2]. Google’s MapReduce runs on a scalable, distributed computing platform. It was designed for data-intensive applications that need to process huge amounts of data.
Various open source implementations exist. The most popular one is Apache’s Hadoop [3]. Hadoop is a scalable distributed computing platform that includes a file system (HDFS) to store massive data, and a Java MapReduce implementation to process that data.
Other MapReduce implementations exist. QT Concurrent has a MapReduce implementation for multi-core processors.
This module provides an introduction to parallel programming with MapReduce, using QT Concurrent.

Programming with MapReduce

When we decide to solve a programming problem using MapReduce, we provide an implementation for a mapper and for a reducer. N mappers are executed simultaneously, executing the same task for different input data. One or more reducers receive the output generated by the mappers, and apply a reducing phase yielding a final result.
A simple program to count word occurrences in a text corpus is described in [4]:

   map(String input_key, String input_value):
       // input_key: document name 
       // input_value: document contents 
       for each word w in input_value: 
           EmitIntermediate(w, "1"); 

    reduce(String output_key, Iterator intermediate_values): 
       // output_key: a word 
       // output_values: a list of counts 
       int result = 0; 
    
       for each v in intermediate_values: 
          result += ParseInt(v);
       Emit(AsString(result));
 
MapReduce pseudocode that counts word occurrences in a text corpus.
As shown in the code above, a map function receives an input key and an input value (<key, value>) and generates one or more intermediate <key, value(s)> pairs. A reduce function receives the intermediate keys and values that were output by the mappers, processes them in some way, and generates one or more final <key, value(s)> pairs.
You can find a Hadoop implementation of the word count problem at [5], and a QT concurrent implementation in the QT Concurrent package (path: qtconcurrent/examples/wordcount) which can be checked out with subversion.

MapReduce in QT Concurrent

QT Concurrent [6] is a C++ library for multi-threaded applications. Among other things, it provides a MapReduce implementation for multi-core computers. The map function is called in parallel by multiple threads. The number of threads used in a program depends on the number of processor cores available.
Google’s original MapReduce runs in computer clusters. The data it processes is stored in a distributed file system called GFS (Google File System). To minimize I/O bottlenecks, mappers are usually executed in the same node where the data resides. Parallelism comes from multiple computers executing mappers (or reducers) at the same time.
On the other hand, QT Concurrent’s MapReduce implementation works on shared-memory systems. Parallelism comes from multiple threads executing mappers at the same time, on multiple processor cores.
To work with QT Concurrent’s MapReduce, you write a map function and a reduce function. You must also indicate the list of values (e.g., names of files, words, numbers, etc.) that you want to feed your mappers. To do this, use the following API:

QFuture<T> QtConcurrent::mappedReduced ( 
    const Sequence & sequence, 
    MapFunction mapFunction, 
    ReduceFunction reduceFunction, 
    QtConcurrent::ReduceOptions reduceOptions = UnorderedReduce | SequentialReduce )
Signature of the QtConcurrent::mappedReduced function
For example:

QFuture<T> mappedReduced(theList, mapFunction, reduceFunction);
A typical call to the mappedReduced function

Example: Determining if a (Big) Integer is a Probable Prime

An algorithm for finding a probable prime can be defined using the Miller-Rabin primality test, as follows:
    
    let n be a very big odd number
    check n for divisibility by all primes < 2000
    choose m positive integers less than n
    for each of these bases apply the Miller-Rabin test

    
The algorithm described above can be easily parallelized using MapReduce as follows:
  1. Randomly generate an odd (big) integer n greater than 2.
  2. Generate (or read a list of) all primes less than 2000.
  3. MapReduce-Part1:
    • Map function: Each mapper receives a distinct prime number pi and checks whether n is divisible by it. A mapper emits (outputs) a 1 if n is divisible by pi, or a 0 otherwise.
    • Reduce function: Counts the number of divisors. If the number of divisors is greater than zero, n is not a prime number, so the test is terminated.
  4. Generate 100 random positive integers less than n.
  5. MapReduce-Part2:
    • Map function: Each mapper receives a different random number ri and applies the Miller-Rabin primality test. A mapper emits (outputs) a 1 if n does not pass the test, or a 0 if it passes.
    • Reduce function: Counts the number of 1s emitted by the mappers. If that count is zero, n is a probable prime.
The next section shows a MapReduce implementation of the algorithm described above, using QT Concurrent. You can find a Hadoop implementation here.

A MapReduce Implementation Using QT Concurrent

The main program needs to call the mappedReduced function twice (steps 3 and 5 in the algorithm described earlier):

    qDebug() << "\nJob: Divisibility for primes less than 2000";
    qDebug() << "Starting Job";
    
    time.start();
    Counting final = mappedReduced(primes,mapperLess2000,reducerLess2000);
    mapReduce2000Time = time.elapsed();
    qDebug() << "End Job";
    qDebug() << "MapReduce elapsed time: " << (mapReduce2000Time) << "msegs\n";
    
    // The counter of exact divisors must equal zero to continue
    if(final[exactDiv] == 0) {

       de = Decomposition<2048>(number - 1);

       qDebug() << "Job: Divisibility for random numbers less than number to evaluate";
       qDebug() << "Starting Job";
       time.start();
       Counting finalRand = mappedReduced(randomPrimes,mapperRandom,reducerRandom);
       mapReduceRandomTime = time.elapsed();
       qDebug() << "End Job";
       qDebug() << "MapReduce elapsed time: " << (mapReduceRandomTime) << "msegs\n";
    
       // Count of random bases that failed the Miller-Rabin test
       if(finalRand[falseMiller] > 0) {
          qDebug() << "Result: Non-prime";
       } else
          qDebug() << "Result: Probably prime";
    } else qDebug() << "Result: Non-prime";
    
Main program
The Decomposition function is used in the main program because the number must be decomposed before the Miller-Rabin test is performed: the number minus one is represented as the product of 2^k and m, where m is odd.
Two MapReduce processes are implemented. The first MapReduce process evaluates divisibility by all primes less than 2000. Each map function receives a prime as its argument, and the reduce function counts the number of divisors, as shown below:

    Counting mapperLess2000(const unsigned long &prime) {
       Counting rMap; 
       BigInt2048 myPrime(prime);
    
       if( number % myPrime == 0 )
          divisors << prime; 
    
       rMap[( number % myPrime == 0 )?exactDiv:inexactDiv]; // creates the key; the reducer counts key occurrences
       return rMap;
    }
    
    void reducerLess2000(Counting &result, const Counting &w) {
        QMapIterator<QString, int> i(w); 
    
        while(i.hasNext()) {
           i.next();
           result[i.key()]++;
        }
    }
MapReduce code to test for primality
The second MapReduce process is executed only if the random number was not divisible by any of the prime numbers less than 2000:

    Counting mapperRandom(const BigInt2048 &prime) {
        
        Counting rMap;
        BigInt2048 z( getZ(prime,de.getM()).bits );
    
        // This is the easy case; the first term in the sequence
        // is correct, so we pass the test.
    
        if(z == 1) {

           rMap[trueMiller];
           divisors << z;
           return rMap;

        } else {
    
           for (int j = 0; j < de.getK(); j++) {
               BigInt2048 zSquared = BigInt2048(newZ(z).bits);
           
               if (zSquared == 1 && z == (number - 1) ) {
    
                  // We've passed the hard version of the Miller-Rabin test.
                  rMap[trueMiller];
                  divisors << z;
                  return rMap;

               }
           
               z = zSquared;
           }
        
           rMap[falseMiller];
           return rMap;
        }
    }
    
    
    void reducerRandom(Counting &result, const Counting &w) {

        QMapIterator<QString, int> i(w);
        while(i.hasNext()) {
            i.next();
            result[i.key()]++;
        }
    }
    
MapReduce code that applies the Miller-Rabin test
Figure 1 shows the results of one run of the program.
Figure 1: Prime Validator program
You can download the complete source of this example here.

References

