Friday, September 7, 2007

Fundamental Concepts - OOP

Object-oriented programming (OOP) is a programming paradigm that uses "objects" and their interactions to design applications and computer programs. It is based on several techniques, including inheritance, modularity, polymorphism, and encapsulation.


Class
A class defines the abstract characteristics of a thing (object), including the thing's characteristics (its attributes, fields or properties) and the thing's behaviors (the things it can do or methods or features). For example, the class Dog would consist of traits shared by all dogs, such as breed and fur color (characteristics), and the ability to bark (behavior). Classes provide modularity and structure in an object-oriented computer program. A class should typically be recognizable to a non-programmer familiar with the problem domain, meaning that the characteristics of the class should make sense in context. Also, the code for a class should be relatively self-contained. Collectively, the properties and methods defined by a class are called members.


Object
A particular instance of a class. The class of Dog defines all possible dogs by listing the characteristics and behaviors they can have; the object Lassie is one particular dog, with particular versions of the characteristics. A Dog has fur; Lassie has brown-and-white fur. In programmer jargon, the object Lassie is an instance of the Dog class. The set of values of the attributes of a particular object is called its state.The object consists of state and the behaviour that's defined in the object's class.


Method
An object's abilities. Lassie, being a Dog, has the ability to bark. So bark() is one of Lassie's methods. She may have other methods as well, for example sit() or eat(). Within the program, using a method should only affect one particular object; all Dogs can bark, but you need one particular dog to do the barking.


Message passing
"The process by which an object sends data to another object or asks the other object to invoke a method." Also known to some programming languages as interfacing


Inheritance
"Subclasses" are more specialized versions of a class, which inherit attributes and behaviors from their parent classes, and can introduce their own.
For example, the class Dog might have sub-classes called Collie, Chihuahua, and GoldenRetriever. In this case, Lassie would be an instance of the Collie subclass. Suppose the Dog class defines a method called bark() and a property called furColor. Each of its sub-classes (Collie, Chihuahua, and GoldenRetriever) will inherit these members, meaning that the programmer only needs to write the code for them once.
Each subclass can alter its inherited traits. For example, the Collie class might specify that the default furColor for a collie is brown-and-white. The Chihuahua subclass might specify that the bark() method produces a high-pitched by default. Subclasses can also add new members. The Chihuahua subclass could add a method called tremble(). So an individual chihuahua instance would use a high-pitched bark() from the Chihuahua subclass, which in turn inherited the usual bark() from Dog. The chihuahua object would also have the tremble() method, but Lassie would not, because she is a Collie, not a Chihuahua. In fact, inheritance is an "is-a" relationship: Lassie is a Collie. A Collie is a Dog. Thus, Lassie inherits the members of both Collies and Dogs.
Multiple inheritance is inheritance from more than one ancestor class, neither of these ancestors being an ancestor of the other. For example, independent classes could define Dogs and Cats, and a Chimera object could be created from these two which inherits all the (multiple) behavior of cats and dogs. This is not always supported, as it can be hard both to implement and to use well.



Encapsulation
Encapsulation conceals the functional details of a class from objects that send messages to it.
For example, the Dog class has a bark() method. The code for the bark() method defines exactly how a bark happens (e.g., by inhale() and then exhale(), at a particular pitch and volume). Timmy, Lassie's friend, however, does not need to know exactly how she barks. Encapsulation is achieved by specifying which classes may use the members of an object. The result is that each object exposes to any class a certain interface — those members accessible to that class. The reason for encapsulation is to prevent clients of an interface from depending on those parts of the implementation that are likely to change in future, thereby allowing those changes to be made more easily, that is, without changes to clients. For example, an interface can ensure that puppies can only be added to an object of the class Dog by code in that class. Members are often specified as public, protected or private, determining whether they are available to all classes, sub-classes or only the defining class. Some languages go further: Java uses the default access modifier to restrict access also to classes in the same package, C# and VB.NET reserve some members to classes in the same assembly using keywords internal (C#) or Friend (VB.NET), and Eiffel and C++ allows one to specify which classes may access any member.



Abstraction
Abstraction is simplifying complex reality by modeling classes appropriate to the problem, and working at the most appropriate level of inheritance for a given aspect of the problem.
For example, Lassie the Dog may be treated as a Dog much of the time, a Collie when necessary to access Collie-specific attributes or behaviors, and as an Animal (perhaps the parent class of Dog) when counting Timmy's pets.Abstraction is also achieved through Composition. For example, a class Car would be made up of an Engine, Gearbox, Steering objects, and many more components. To build the Car class, one does not need to know how the different components work internally, but only how to interface with them, i.e., send messages to them, receive messages from them, and perhaps make the different objects composing the class interact with each other.



Polymorphism
Polymorphism allows you to treat derived class members just like their parent class's members. More precisely, Polymorphism in object-oriented programming is the ability of objects belonging to different data types to respond to method calls of methods of the same name, each one according to an appropriate type-specific behavior. One method, or an operator such as +, -, or *, can be abstractly applied in many different situations. If a Dog is commanded to speak(), this may elicit a Bark. However, if a Pig is commanded to speak(), this may elicit an Oink. They both inherit speak() from Animal, but their derived class methods override the methods of the parent class; this is Overriding Polymorphism. Overloading Polymorphism is the use of one method signature, or one operator such as "+", to perform several different functions depending on the implementation. The "+" operator, for example, may be used to perform integer addition, float addition, list concatenation, or string concatenation. Any two subclasses of Number, such as Integer and Double, are expected to add together properly in an OOP language. The language must therefore overload the concatenation operator, "+", to work this way. This helps improve code readability. How this is implemented varies from language to language, but most OOP languages support at least some level of overloading polymorphism. Many OOP languages also support Parametric Polymorphism, where code is written without mention of any specific type and thus can be used transparently with any number of new types. Pointers are an example of a simple polymorphic routine that can be used with many different types of objects.



Not all of the above concepts are to be found in all object-oriented programming languages, and so object-oriented programming that uses classes is called sometimes class-based programming. In particular, prototype-based programming does not typically use classes. As a result, a significantly different yet analogous terminology is used to define the concepts of object and instance, although there are no objects in these languages.

Tuesday, September 4, 2007

Differences Between C and Java

If you are a C or C++ programmer, you should have found much of the syntax of Java--particularly at the level of operators and statements--to be familiar. Because Java and C are so similar in some ways, it is important for C and C++ programmers to understand where the similarities end. There are a number of important differences between C and Java, which are summarized in the following list:

No preprocessor

Java does not include a preprocessor and does not define any analogs of the #define, #include, and #ifdef directives. Constant definitions are replaced with staticfinal fields in Java. (See the java.lang.Math.PI field for an example.) Macro definitions are not available in Java, but advanced compiler technology and inlining has made them less useful. Java does not require an #include directive because Java has no header files. Java class files contain both the class API and the class implementation, and the compiler reads API information from class files as necessary. Java lacks any form of conditional compilation, but its cross-platform portability means that this feature is very rarely needed.

No global variables

Java defines a very clean namespace. Packages contain classes, classes contain fields and methods, and methods contain local variables. But there are no global variables in Java, and, thus, there is no possibility of namespace collisions among those variables.

Well-defined primitive type sizes

All the primitive types in Java have well-defined sizes. In C, the size of short, int, and long types is platform-dependent, which hampers portability.

No pointers

Java classes and arrays are reference types, and references to objects and arrays are akin to pointers in C. Unlike C pointers, however, references in Java are entirely opaque. There is no way to convert a reference to a primitive type, and a reference cannot be incremented or decremented. There is no address-of operator like &, dereference operator like * or −>, or sizeof operator. Pointers are a notorious source of bugs. Eliminating them simplifies the language and makes Java programs more robust and secure.

Garbage collection

The Java Virtual Machine performs garbage collection so that Java programmers do not have to explicitly manage the memory used by all objects and arrays. This feature eliminates another entire category of common bugs and all but eliminates memory leaks from Java programs.

No goto statement

Java doesn't support a goto statement. Use of goto except in certain well-defined circumstances is regarded as poor programming practice. Java adds exception handling and labeled break and continue statements to the flow-control statements offered by C. These are a good substitute for goto.

Variable declarations anywhere

C requires local variable declarations to be made at the beginning of a method or block, while Java allows them anywhere in a method or block. Many programmers prefer to keep all their variable declarations grouped together at the top of a method, however.

Forward references

The Java compiler is smarter than the C compiler, in that it allows methods to be invoked before they are defined. This eliminates the need to declare functions in a header file before defining them in a program file, as is done in C.

Method overloading

Java programs can define multiple methods with the same name, as long as the methods have different parameter lists.

No struct and union types

Java doesn't support C struct and union types. A Java class can be thought of as an enhanced struct, however.

No enumerated types

Java doesn't support the enum keyword used in C to define types that consist of fixed sets of named values. This is surprising for a strongly typed language like Java, but there are ways to simulate this feature with object constants.

No bitfields

Java doesn't support the (infrequently used) ability of C to specify the number of individual bits occupied by fields of a struct.

No typedef

Java doesn't support the typedef keyword used in C to define aliases for type names. Java's lack of pointers makes its type-naming scheme simpler and more consistent than C's, however, so many of the common uses of typedef are not really necessary in Java.

No method pointers

C allows you to store the address of a function in a variable and pass this function pointer to other functions. You cannot do this with Java methods, but you can often achieve similar results by passing an object that implements a particular interface. Also, a Java method can be represented and invoked through a java.lang.reflect.Method object.

No variable-length argument lists

Java doesn't allow you to define methods such as C's printf() that take a variable number of arguments. Method overloading allows you to simulate C varargs functions for simple cases, but there's no general replacement for this feature.

Differences Between Java and C/C++


  • The Preprocessor
  • Pointers
  • Structures and Unions
  • Functions
  • Multiple Inheritance
  • Strings
  • The goto Statement
  • Operator Overloading
  • Automatic Coercions
  • Variable Arguments
  • Command-Line Arguments

It is no secret that the Java language is highly derived from the C and C++ languages. Because C++ is currently considered the language of choice for professional software developers, it's important to understand what aspects of C++ Java inherits. Of possibly even more importance are what aspects of C++ Java doesn't support. Because Java is an entirely new language, it was possible for the language architects to pick and choose which features from C++ to implement in Java and how.

The focus of this appendix is to point out the differences between Java and C++. If you are a C++ programmer, you will be able to appreciate the differences between Java and C++. Even if you don't have any C++ experience, you can gain some insight into the Java language by understanding what C++ discrepancies it clears up in its implementation. Because C++ backwardly supports C, many of the differences pointed out in this appendix refer to C++, but inherently apply to C as well.

The Preprocessor

All C/C++ compilers implement a stage of compilation known as the preprocessor. The C++ preprocessor basically performs an intelligent search and replace on identifiers that have been declared using the #define or #typedef directives. Although most advocates of C++ discourage the use of the preprocessor, which was inherited from C, it is still widely used by most C++ programmers. Most of the processor definitions in C++ are stored in header files, which complement the actual source code files.

The problem with the preprocessor approach is that it provides an easy way for programmers to inadvertently add unnecessary complexity to a program. What happens is that many programmers using the #define and #typedef directives end up inventing their own sublanguage within the confines of a particular project. This results in other programmers having to go through the header files and sort out all the #define and #typedef information to understand a program, which makes code maintenance and reuse almost impossible. An additional problem with the preprocessor approach is that it is weak when it comes to type checking and validation.

Java does not have a preprocessor. It provides similar functionality (#define, #typedef, and so on) to that provided by the C++ preprocessor, but with far more control. Constant data members are used in place of the #define directive, and class definitions are used in lieu of the #typedef directive. The result is that Java source code is much more consistent and easier to read than C++ source code. Additionally, Java programs don't use header files; the Java compiler builds class definitions directly from the source code files, which contain both class definitions and method implementations.

Pointers

Most developers agree that the misuse of pointers causes the majority of bugs in C/C++ programming. Put simply, when you have pointers, you have the ability to trash memory. C++ programmers regularly use complex pointer arithmetic to create and maintain dynamic data structures. In return, C++ programmers spend a lot of time hunting down complex bugs caused by their complex pointer arithmetic.

The Java language does not support pointers. Java provides similar functionality by making heavy use of references. Java passes all arrays and objects by reference. This approach prevents common errors due to pointer mismanagement. It also makes programming easier in a lot of ways simply because the correct usage of pointers is easily misunderstood by all but the most seasoned programmers.

You may be thinking that the lack of pointers in Java will keep you from being able to implement many data structures, such as dynamic arrays. The reality is that any pointer task can be carried out just as easily and more reliably with objects and arrays of objects. You then benefit from the security provided by the Java runtime system; it performs boundary checking on all array indexing operations.

Structures and Unions

There are three types of complex data types in C++: classes, structures, and unions. Java only implements one of these data types: classes. Java forces programmers to use classes when the functionality of structures and unions is desired. Although this sounds like more work for the programmer, it actually ends up being more consistent, because classes can imitate structures and unions with ease. The Java designers really wanted to keep the language simple, so it only made sense to eliminate aspects of the language that overlapped.

Functions

In C, code is organized into functions, which are global subroutines accessible to a program. C++ added classes and in doing so provided class methods, which are functions that are connected to classes. C++ class methods are very similar to Java class methods. However, because C++ still supports C, there is nothing discouraging C++ programmers from using functions. This results in a mixture of function and method use that makes for confusing programs.

Java has no functions. Being a purer object-oriented language than C++, Java forces programmers to bundle all routines into class methods. There is no limitation imposed by forcing programmers to use methods instead of functions. As a matter of fact, implementing routines as methods encourages programmers to organize code better. Keep in mind that strictly speaking there is nothing wrong with the procedural approach of using functions; it just doesn't mix well with the object-oriented paradigm that defines the core of Java.

Multiple Inheritance

Multiple inheritance is a feature of C++ that allows you to derive a class from multiple parent classes. Although multiple inheritance is indeed powerful, it is complicated to use correctly and causes many problems otherwise. It is also very complicated to implement from the compiler perspective.

Java takes the high road and provides no direct support for multiple inheritance. You can implement functionality similar to multiple inheritance by using interfaces in Java. Java interfaces provide object method descriptions but contain no implementations.

Strings

C and C++ have no built-in support for text strings. The standard technique adopted among C and C++ programmers is that of using null-terminated arrays of characters to represent strings.

In Java, strings are implemented as first class objects (String and StringBuffer), meaning that they are at the core of the Java language. Java's implementation of strings as objects provides several advantages:

  • The manner in which you create strings and access the elements of strings is consistent across all strings on all systems
  • Because the Java string classes are defined as part of the Java language and not part of some extraneous extension, Java strings function predictably every time
  • The Java string classes perform extensive runtime checking, which helps eliminate troublesome runtime errors

The goto Statement

The dreaded goto statement is pretty much a relic these days even in C and C++, but it is technically a legal part of the languages. The goto statement has historically been cited as the cause for messy, impossible to understand, and sometimes even impossible to predict code known as "spaghetti code." The primary usage of the goto statement has merely been as a convenience to substitute not thinking through an alternative, more structured branching technique.

For all these reasons and more, Java does not provide a goto statement. The Java language specifies goto as a keyword, but its usage is not supported. I suppose the Java designers wanted to eliminate the possibility of even using goto as an identifier! Not including goto in the Java language simplifies the language and helps eliminate the option of writing messy code.

Operator Overloading

Operator overloading, which is considered a prominent feature in C++, is not supported in Java. Although roughly the same functionality can be implemented by classes in Java, the convenience of operator overloading is still missing. However, in defense of Java, operator overloading can sometimes get very tricky. No doubt the Java developers decided not to support operator overloading to keep the Java language as simple as possible.

Automatic Coercions

Automatic coercion refers to the implicit casting of data types that sometimes occurs in C and C++. For example, in C++ you can assign a float value to an int variable, which can result in a loss of information. Java does not support C++ style automatic coercions. In Java, if a coercion will result in a loss of data, you must always explicitly cast the data element to the new type.

Variable Arguments

C and C++ let you declare functions, such as printf, that take a variable number of arguments. Although this is a convenient feature, it is impossible for the compiler to thoroughly type check the arguments, which means problems can arise at runtime without you knowing. Again Java takes the high road and doesn't support variable arguments at all.

Command-Line Arguments

The command-line arguments passed from the system into a Java program differ in a couple of ways from the command-line arguments passed into a C++ program. First, the number of parameters passed differs between the two languages. In C and C++, the system passes two arguments to a program: argc and argv. argc specifies the number of arguments stored in argv. argv is a pointer to an array of characters containing the actual arguments. In Java, the system passes a single value to a program: args. args is an array of Strings that contains the command-line arguments. In C and C++, the command-line arguments passed into a program include the name used to invoke the program. This name always appears as the first argument and is rarely ever used. In Java, you already know the name of the program because it is the same name as the class, so there is no need to pass this information as a command-line argument. Therefore, the Java runtime system only passes the arguments following the name that invoked the program.