There are many ways to store data. So far, we have covered linear recursive structures (lists) and binary recursive structures (trees). Let's consider another way of storing data: as a contiguous, numbered (indexed) set of data storage elements:
anArray =
itemA | itemB | itemC | itemD | itemE | itemF | itemG | itemH | itemI | itemJ |
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
This "array" of elements allows us to access any individual element using a numbered index value.
- DEFINITION 1: array
- In its most basic form, a random access data structure where any element can be accessed by specifying a single index value corresponding to that element.
EXAMPLE
anArray[4] gives us itemE. Likewise, the statement anArray[7] = 42 should replace itemH with 42.
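As a minimal sketch in Java (the language used in the rest of this section), the two statements above might look like the following. The class name IndexDemo, the method build, and the choice of Object as the element type are assumptions made purely for illustration.

```java
class IndexDemo {
    // Builds the ten-element array from the diagram, then overwrites index 7.
    static Object[] build() {
        Object[] anArray = { "itemA", "itemB", "itemC", "itemD", "itemE",
                             "itemF", "itemG", "itemH", "itemI", "itemJ" };
        anArray[7] = 42; // replaces itemH with 42 (autoboxed to an Integer)
        return anArray;
    }

    public static void main(String[] args) {
        Object[] anArray = build();
        System.out.println(anArray[4]); // the element stored at index 4
        System.out.println(anArray[7]); // the element stored at index 7 after the assignment
    }
}
```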
NOTE:
Notice, however, that the above definition is not a recursive definition. This will cause problems.
Arrays in Java
- Arrays...
  - are contiguous (in memory) sets of object references (or values, for primitives),
  - are objects,
  - are dynamically created (via new), and
  - may be assigned to variables of type Object.
- An array object contains zero or more unnamed variables of the same type. These variables are commonly called the elements of the array.
- A non-negative integer is used to name each element. For example, arrayOfInts[i] refers to the (i+1)st element in the arrayOfInts array. In computer-ese, an array is said to be a "random access" container, because you can directly (and I suppose, randomly) access any element in the array.
- An array has a limited amount of intelligence; for instance, it knows its length at all times, e.g. arrayOfInts.length.
- Arrays have the advantage that they
  - provide random access to any element,
  - are fast, and
  - require minimal amounts of memory.
More information on arrays can be found in the Java Resources web site page on arrays
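The properties listed above can be checked with a short sketch. The class name ArrayFacts and the helper asObject are hypothetical names chosen for this example only.

```java
class ArrayFacts {
    // An array is an object, so a reference to it can be stored in an Object variable.
    static Object asObject(int[] arrayOfInts) {
        return arrayOfInts;
    }

    public static void main(String[] args) {
        int[] arrayOfInts = new int[3];          // dynamically created with new
        Object o = asObject(arrayOfInts);        // legal: arrays are objects
        System.out.println(o instanceof int[]);  // true
        System.out.println(arrayOfInts.length);  // 3
        arrayOfInts[2] = 7;                      // index 2 names the third element
        System.out.println(arrayOfInts[2]);      // 7
    }
}
```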
REMEMBER:
Arrays are size and speed at a price.
Array Types
- An array type is written as the name of an element type followed by one or more empty pairs of square brackets.
  - For example, int[] is the type corresponding to a one-dimensional array of integers.
- An array's length is not part of its type.
- The element type of an array may be any type, whether primitive or reference, including interface types and abstract class types.
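A small sketch can make the last two points concrete. The class ArrayTypes and the method sameType are illustrative names, and CharSequence is used only as a convenient example of an interface type.

```java
class ArrayTypes {
    // Two int arrays of different lengths share the same runtime class,
    // because the length is not part of the array type.
    static boolean sameType(int[] a, int[] b) {
        return a.getClass() == b.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameType(new int[3], new int[50])); // true
        // The element type may be an interface type such as CharSequence.
        CharSequence[] texts = { "a String", new StringBuilder("a StringBuilder") };
        System.out.println(texts.length); // 2
    }
}
```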
Array Variables
- Array variables are declared like other variables: a declaration consists of the array's type followed by the array's name. For example,
double[][] matrixOfDoubles;
declares a variable whose type is a two-dimensional array of double-precision floating-point numbers.
- Declaring a variable of array type does not create an array object. It only creates the variable, which can contain a reference to an array.
- Because an array's length is not part of its type, a single variable of array type may contain references to arrays of different lengths.
- To complicate declarations, C/C++-like syntax is also supported, for example,
double rowvector[], colvector[], matrix[][];
This declaration is equivalent to
double[] rowvector, colvector, matrix[];
or
double[] rowvector, colvector; double[][] matrix;
Please use the latter!
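The difference between declaring an array variable and creating an array object can be sketched as follows. The class ArrayVariables and the method lengthsSeen are hypothetical names used only for this illustration.

```java
class ArrayVariables {
    // Returns the lengths seen while one variable refers to two different arrays.
    static int[] lengthsSeen() {
        double[] rowvector;            // declares the variable only; no array exists yet
        rowvector = new double[3];     // now it refers to an array of length 3
        int first = rowvector.length;
        rowvector = new double[10];    // same variable, different array, different length
        int second = rowvector.length;
        return new int[] { first, second };
    }

    public static void main(String[] args) {
        int[] lengths = lengthsSeen();
        System.out.println(lengths[0] + " then " + lengths[1]); // 3 then 10
    }
}
```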
Array Creation
- Array objects, like other objects, are created with new. For example,
String[] arrayOfStrings = new String[10];
declares a variable whose type is an array of strings, and initializes it to hold a reference to an array object with room for ten references to strings.
- Another way to initialize array variables is
int[] arrayOf1To5 = { 1, 2, 3, 4, 5 };
String[] arrayOfStrings = { "array", "of", "String" };
Widget[] arrayOfWidgets = { new Widget(), new Widget() };
int[][] arrayOfArrayOfInt = {{ 1, 2 }, { 3, 4 }};
- Once an array object is created, it never changes length!
- The array's length is available as a final instance variable length. For example,
int[] arrayOf1To5 = { 1, 2, 3, 4, 5 };
System.out.println(arrayOf1To5.length);
would print "5".
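One consequence of creating an array with new is worth seeing in code: the array has room for its elements, but each element starts out at the default value for its type (null for references, 0 for int). The sketch below assumes nothing beyond the standard language; the class and method names are made up for the example.

```java
class ArrayCreation {
    // new fills every element with the default value for the element type.
    static String firstElementOfNewStringArray() {
        String[] arrayOfStrings = new String[10];
        return arrayOfStrings[0];   // null: room for ten references, but no strings yet
    }

    public static void main(String[] args) {
        System.out.println(firstElementOfNewStringArray());  // null
        int[] zeros = new int[4];
        System.out.println(zeros[0]);                        // 0
        int[][] arrayOfArrayOfInt = { { 1, 2 }, { 3, 4 } };
        System.out.println(arrayOfArrayOfInt[1][0]);         // 3
        System.out.println(arrayOfArrayOfInt.length);        // 2
    }
}
```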
Array Accesses
- Indices for arrays must be int values that are greater than or equal to 0 and less than the length of the array. Remember that computer scientists always count starting at zero, not one!
- All array accesses are checked at run time: an attempt to use an index that is less than zero or greater than or equal to the length of the array causes an ArrayIndexOutOfBoundsException (a subclass of IndexOutOfBoundsException) to be thrown.
- Array elements can be used on either side of an equals sign:
myArray[i] = aValue;
someValue = myArray[j];
- Accessing elements of an array is fast and the time to access any element is independent of where it is in the array.
- Inserting elements into an array is very slow because all the other elements following the insertion point have to be moved to make room, if that is even possible.
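The run-time bounds check and the cost of insertion can both be sketched in a few lines. The helper names isOutOfBounds and insertAt are hypothetical. Because an array never changes length, this version of insertion allocates a new, longer array and copies every element across, which is one possible way to "make room", not the only one.

```java
import java.util.Arrays;

class ArrayAccess {
    // Returns true if reading arr[index] throws an ArrayIndexOutOfBoundsException.
    static boolean isOutOfBounds(int[] arr, int index) {
        try {
            int ignored = arr[index];
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    // Inserts value at position pos by building a longer array and
    // shifting every element from pos onward one slot to the right.
    static int[] insertAt(int[] arr, int pos, int value) {
        int[] result = new int[arr.length + 1];
        System.arraycopy(arr, 0, result, 0, pos);                      // elements before pos
        result[pos] = value;
        System.arraycopy(arr, pos, result, pos + 1, arr.length - pos); // elements from pos onward
        return result;
    }

    public static void main(String[] args) {
        int[] arr = { 10, 20, 30 };
        System.out.println(isOutOfBounds(arr, 2));  // false
        System.out.println(isOutOfBounds(arr, 3));  // true
        System.out.println(isOutOfBounds(arr, -1)); // true
        System.out.println(Arrays.toString(insertAt(arr, 1, 15))); // [10, 15, 20, 30]
    }
}
```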
Array Processing Using Loops
More information on loops can be found at the Java Resources web site page on loops.
The main technique used to process arrays is the for loop. A for loop is a way of processing each element of the array in a sequential manner.
Here is a typical for loop:
// Sum the elements of an array of ints, myArray.
int sum = 0; // initialize the sum
for(int idx=0; idx < myArray.length; idx++) { //start idx @ 0; end idx at length-1;
    //increment idx every time the loop is processed.
    sum += myArray[idx]; // add the idx'th element of myArray to the sum
}
There are a number of things going on in the above for loop:
- Before the loop starts, the index idx is declared and initialized to zero. idx is visible only within the for loop (its header and the body between the curly braces).
- At the beginning of each loop iteration, the index idx is tested in a "termination condition"; in this case, idx is compared to the length of the array. If the termination condition evaluates to false, the loop terminates immediately.
- During each loop iteration, the value of the idx'th element of the array is added to the running sum.
- After each loop iteration, the index idx is incremented.
One can traverse an array in any direction one wishes:
// Sum the elements of an array of ints, myArray, from back to front.
int sum = 0; // initialize the sum
for(int idx=myArray.length-1; 0<=idx; idx--) { //start idx @ length-1; end idx at 0;
    //decrement idx every time the loop is processed.
    sum += myArray[idx]; // add the idx'th element of myArray to the sum
}
The above loop sums the array just as well as the first example, but it does it from back to front. Note, however, that we had to be a little careful about how we initialized the index and how we set up the termination condition.
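To try the two traversal directions side by side, the loops can be wrapped in methods. This is only a sketch; the class name ArraySum and the method names are assumptions, not part of the original notes.

```java
class ArraySum {
    // Sums the elements from index 0 up to length-1.
    static int sumForward(int[] myArray) {
        int sum = 0;
        for (int idx = 0; idx < myArray.length; idx++) {
            sum += myArray[idx];
        }
        return sum;
    }

    // Sums the same elements from index length-1 down to 0.
    static int sumBackward(int[] myArray) {
        int sum = 0;
        for (int idx = myArray.length - 1; 0 <= idx; idx--) {
            sum += myArray[idx];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] myArray = { 3, 1, 4, 1, 5 };
        System.out.println(sumForward(myArray));    // 14
        System.out.println(sumBackward(myArray));   // 14
        System.out.println(sumForward(new int[0])); // 0: the loop body never runs
    }
}
```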
Here's a little more complicated loop:
// Find the index of the smallest element in an array of ints, myArray.
// Requires: import java.util.NoSuchElementException;
int minIdx = 0; // initialize the index. Must be declared outside the loop.
if(0==myArray.length) throw new NoSuchElementException("Empty array!"); // no good if array is empty!
else {
    int j; // declared here because a for initializer cannot mix an assignment with a declaration
    for(minIdx = 0, j = 1; j<myArray.length; j++) { //start minIdx @ 0, start index @ 1;
        //end index at length-1; increment index every time the loop is processed.
        if(myArray[minIdx] > myArray[j])
            minIdx = j; // found new minimum
    }
}
Some important things to notice about this algorithm:
- The empty case must be checked explicitly; there is no polymorphism to help you out here!
- The desired result index cannot be declared inside the for loop because otherwise it won't be visible to the outside world.
- Be careful about using the minIdx value if the array was indeed empty: it's an invalid value! It can't be set to a valid value because otherwise you can't tell the difference between a value that was never set and one that was.
- The for loop has two initialization statements separated by a comma.
- The loop does work correctly if the array only has one element, but only because the termination check is done before the loop body.
- Notice that to prove that this algorithm works properly, one must make separate arguments about the empty case, the one-element case and the n-element case. Contrast this to the much simpler list algorithm that only needs empty and non-empty cases.
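The three cases named above (empty, one element, n elements) can be exercised by wrapping the algorithm in a method. This is a sketch under the assumption that throwing NoSuchElementException is the desired reaction to an empty array; the names MinIndex and indexOfMin are hypothetical.

```java
import java.util.NoSuchElementException;

class MinIndex {
    // Returns the index of the smallest element; the first one wins on ties.
    static int indexOfMin(int[] myArray) {
        if (myArray.length == 0) {
            throw new NoSuchElementException("Empty array!"); // empty case, checked explicitly
        }
        int minIdx = 0; // one-element case: the loop below never runs and 0 is returned
        for (int j = 1; j < myArray.length; j++) {
            if (myArray[minIdx] > myArray[j]) {
                minIdx = j; // found a new minimum
            }
        }
        return minIdx;
    }

    public static void main(String[] args) {
        System.out.println(indexOfMin(new int[] { 5, 2, 8, 2 })); // 1
        System.out.println(indexOfMin(new int[] { 7 }));          // 0
    }
}
```

Returning the index from a method sidesteps the question of what minIdx should hold for an empty array, since no value is ever handed back in that case.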
For convenience, Java 5.0 and later offer a compact syntax for traversing all the elements of an array or of any object that implements the Iterable interface:
MyType[] myArray; // array is initialized with data somewhere
for(MyType x: myArray){
// code involving x, i.e. each element in the array
}
It is important to remember that this syntax is used when one wants to process every element in an array (or an Iterable object) without needing the index of each element. An array is traversed from index 0 up to length-1; an Iterable is traversed in whatever order its iterator supplies.
Let's look at an algorithm where we might not want to process the entire array:
// Find the first index of a given value in an array
int idx = -1; // initialize the index to an invalid value.
for(int j=0; j<myArray.length; j++) { //start j @ 0; end j at length-1;
    //increment j every time the loop is processed.
    if(desiredValue == myArray[j]) { // found match!
        idx = j; // save the index.
        break; // break out of the loop.
    }
}
Notes:
- The only way you can tell if the desired value was actually found or not is if the value of idx is -1 or not. Thus the value of idx must be checked before it is ever used.
- The resultant idx variable cannot be used as the index inside the loop because one would not be able to tell if the desired value was found or not unless one also checked the length of the array. This is because if the desired value was never found, idx at the end of the loop would equal the length of the array, which is only an invalid value if you already know the length of the array.
- The break statement stops the loop right away and execution resumes at the point right after the end of the loop.
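The search and the mandatory check of its result can be sketched as a method plus a caller. The names FirstIndex and indexOf are assumptions made for this example.

```java
class FirstIndex {
    // Returns the index of the first occurrence of desiredValue, or -1 if it is absent.
    static int indexOf(int[] myArray, int desiredValue) {
        int idx = -1; // invalid value meaning "not found"
        for (int j = 0; j < myArray.length; j++) {
            if (desiredValue == myArray[j]) {
                idx = j;  // save the index
                break;    // stop at the first match
            }
        }
        return idx;
    }

    public static void main(String[] args) {
        int[] myArray = { 7, 3, 9, 3 };
        int idx = indexOf(myArray, 3);
        if (idx != -1) { // the result must be checked before it is used as an index
            System.out.println("found at " + idx); // found at 1
        }
        System.out.println(indexOf(myArray, 4)); // -1
    }
}
```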
There is a counterpart to break called continue that causes the loop to immediately progress to the beginning of the next iteration. It is used much more rarely than break, usually in situations where there are convoluted and deeply nested if-else statements.
Can you see that the compact syntax of for loops comes at the price of a clear understanding of the process?
While loops
for loops are actually a specialized version of while loops. while loops have no special provisions for initialization or loop incrementing, just the termination condition.
while loops iterate through the loop body until the termination condition evaluates to a false value.
The following for loop:
for([initialization statement]; [termination expr] ; [increment statement]) {
[loop body]
}
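is equivalent to the following (provided the loop body contains no continue statement, which in the for loop would still run the increment statement):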
{
[initialization statement];
while([termination expr]) {
[loop body]
[increment statement];
}
}
Note the outermost curly braces that create the scoping boundary that encapsulates any variable declared inside the for loop.
The Java compiler treats a for loop in essentially the same way as the above while loop.
Here is the above algorithm that finds a desired value in an array, translated from a for loop to a while loop:
// Find the index of the first occurrence of desiredValue in myArray, using a while loop.
int idx = -1; // initialize the final result; declared outside the block so it stays visible afterwards
{
    int j = 0; // initialize the index
    while(j < myArray.length) { // loop through the array
        if(desiredValue == myArray[j]) { // check if found the value
            idx = j; // save the index
            break; // exit the loop.
        }
        j++; // increment the index
    }
}
Basically, for loops give up some of the flexibility of a while loop in favor of a more compact syntax.
while loops are very useful when the data is not sequentially accessible via some sort of index. Another useful scenario for while loops is when the algorithm is waiting for something to happen or for a value to come into the system from an outside (relatively) source.
do-while loops are the same as while loops except that the conditional statement is evaluated at the end of the loop body, not its beginning as in a for or while loop.
See the Java Resources web site page on loops for more information on processing lists using while loops.
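The practical difference between while and do-while shows up when the condition is false from the start: a do-while body still runs once. The sketch below counts body executions for both forms; the class LoopKinds and its method names are illustrative only.

```java
class LoopKinds {
    // Counts how many times a while loop body runs when counting down from start.
    static int whileRuns(int start) {
        int runs = 0;
        int n = start;
        while (n > 0) {   // condition tested before the body
            runs++;
            n--;
        }
        return runs;
    }

    // Same countdown with do-while: the condition is tested after the body.
    static int doWhileRuns(int start) {
        int runs = 0;
        int n = start;
        do {
            runs++;
            n--;
        } while (n > 0);
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(whileRuns(3));   // 3
        System.out.println(doWhileRuns(3)); // 3
        System.out.println(whileRuns(0));   // 0
        System.out.println(doWhileRuns(0)); // 1: the body runs once before the test
    }
}
```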
for-each loops
An exceedingly common for loop to write is the following:
Stuff[] s_array = new Stuff[n];
// fill s_array with values
for(int i = 0; i < s_array.length; i++) {
// do something with s_array[i]
}
Essentially, the loop does some invariant processing on every element of the array.
To make life easier, Java implements the for-each loop, which is just an alternate for loop syntax:
Stuff[] s_array = new Stuff[n];
// fill s_array with values
for(Stuff s:s_array) {
// do something with s
}
Simpler, eh?
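To see that the two forms do the same work on an array, the sketch below joins the elements of a String array both ways. For an array, the for-each loop visits the elements in index order, so the results match. The class ForEachDemo and its methods are hypothetical names.

```java
class ForEachDemo {
    // Indexed for loop: concatenates the elements in index order.
    static String joinIndexed(String[] words) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < words.length; i++) {
            sb.append(words[i]);
        }
        return sb.toString();
    }

    // for-each loop: same traversal, but the index is never visible.
    static String joinForEach(String[] words) {
        StringBuilder sb = new StringBuilder();
        for (String w : words) {
            sb.append(w);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] words = { "a", "b", "c" };
        System.out.println(joinIndexed(words)); // abc
        System.out.println(joinForEach(words)); // abc
    }
}
```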
It turns out that the for-each loop is not limited to arrays. Any class that implements the Iterable interface will work. This is discussed in another module, as it involves the use of generics.
Arrays vs. Lists
In no particular order...
- Arrays:
  - Fast access to all elements.
  - Fixed number of elements held.
  - Difficult to insert elements.
  - Can run into problems with uninitialized elements.
  - Minimal safety for out-of-bounds indices.
  - Minimal memory used.
  - Simple syntax.
  - Must use procedural techniques for processing.
  - Often incompatible with OO architectures.
  - Difficult to prove that processing algorithms are correct.
  - Processing algorithms can be very fast.
  - Processing algorithms can be minimally memory intensive.
- Lists:
  - Slow access except to first element, which is fast.
  - Unlimited number of elements held.
  - Easy to insert elements.
  - Encountering uninitialized elements very rare to impossible.
  - Impossible to make out-of-bounds errors.
  - Not optimized for memory usage.
  - More cumbersome syntax.
  - Can use OO and polymorphic recursive techniques for processing.
  - Very compatible with OO architectures.
  - Easy to prove that processing algorithms are correct.
  - Processing algorithms can be quite fast if tail-recursive and using a tail-call optimizing compiler.
  - Processing algorithms can be very memory intensive unless tail-recursive and using a tail-call optimizing compiler.
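To make the contrast a little more tangible, here is a rough sketch that sums the same values held in a recursive list and in an array. The IntList, Empty and Cons types are hypothetical stand-ins written for this example; they are not the list framework used elsewhere in these notes.

```java
class ListVsArray {
    // Hypothetical minimal recursive list of ints: either empty, or a first element plus a rest.
    interface IntList {
        int sum();
    }

    static final class Empty implements IntList {
        public int sum() { return 0; }                   // base case: no explicit emptiness test
    }

    static final class Cons implements IntList {
        private final int first;
        private final IntList rest;
        Cons(int first, IntList rest) { this.first = first; this.rest = rest; }
        public int sum() { return first + rest.sum(); }  // inductive case, dispatched polymorphically
    }

    // Array counterpart: a procedural loop over the elements.
    static int sum(int[] values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        IntList list = new Cons(1, new Cons(2, new Cons(3, new Empty())));
        IntList longer = new Cons(0, list);  // inserting at the front creates one node; nothing is moved
        System.out.println(list.sum());                  // 6
        System.out.println(longer.sum());                // 6
        System.out.println(sum(new int[] { 1, 2, 3 }));  // 6
    }
}
```

The list version needs only two cases (empty and non-empty), each handled by its own class, whereas the array version relies on a loop whose correctness depends on the index bounds. Note that this recursive sum is not tail-recursive, so it uses one stack frame per element.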