Archive for the 'object oriented' Category

Implementing a deep clone using private copy constructors in Java

Thursday, July 5th, 2007

One of the drawbacks of Java is that there is no quick-and-easy way to make a fully independent copy of an object. The Object class has a clone() method, but it only makes a shallow copy: if the object being cloned holds references to other objects, those same references are carried over to the clone. That means that changes made to those shared objects through the clone also show up in the original, and this is not usually what is wanted when an object is cloned.
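To make the problem concrete, here is a minimal sketch. The Address and Employee classes are hypothetical, invented only for this illustration, and Employee simply exposes the default Object.clone() behavior.

```java
// Hypothetical classes, for illustration only.
class Address {
    String city;

    Address(String city) {
        this.city = city;
    }
}

class Employee implements Cloneable {
    String name;
    Address address;

    Employee(String name, Address address) {
        this.name = name;
        this.address = address;
    }

    @Override
    public Employee clone() {
        try {
            // Object.clone() copies field by field, so reference
            // fields are copied as references, not as new objects
            return (Employee) super.clone();
        } catch (CloneNotSupportedException e) {
            // cannot happen, because Employee implements Cloneable
            throw new AssertionError(e);
        }
    }
}

class ShallowCloneDemo {
    public static void main(String[] args) {
        Employee original = new Employee("Ann", new Address("Austin"));
        Employee copy = original.clone();

        // mutate the shared Address through the clone
        copy.address.city = "Boston";

        System.out.println(original.address.city);            // prints "Boston"
        System.out.println(original.address == copy.address); // prints "true"
    }
}
```

The clone is a new Employee, but both employees point at one Address, which is exactly the sharing a deep copy has to avoid.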

One method of creating a “deep clone” of a Java Object is to serialize it to XML, then read it back into the program from the XML stream into a new object (serializing to disk also works fine, but takes longer). I may go into detail about this method in a future post. This is usually resorted to when the object structure that needs a deep copy is legacy in nature, and there were no conveniences for copying included in the design.
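As a rough sketch of the serialization idea, the example below swaps in Java's built-in binary serialization to an in-memory byte array rather than XML, because it needs no extra setup. The SerialCopier helper and its deepCopy() method are hypothetical names, and the sketch assumes Java 7 or later and that every object in the graph implements Serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

// Hypothetical helper, not part of the original post.
final class SerialCopier {
    private SerialCopier() { }

    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepCopy(T source) {
        try {
            // write the whole object graph into a byte buffer
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(source);
            }
            // read it back, which builds brand new objects
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Could not copy object", e);
        }
    }
}

class SerialCopyDemo {
    public static void main(String[] args) {
        ArrayList<StringBuilder> original = new ArrayList<>();
        original.add(new StringBuilder("one"));

        ArrayList<StringBuilder> copy = SerialCopier.deepCopy(original);
        copy.get(0).append(" changed");

        System.out.println(original.get(0)); // prints "one"
        System.out.println(copy.get(0));     // prints "one changed"
    }
}
```

The appeal is that the classes being copied need no copy logic of their own. The cost is the serialization round trip and the requirement that everything be serializable.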

The method that I’m going to detail in this post will add copy constructors to each object that will need to be cloned within the object model.

For this example, let’s say I have a Data object that contains:

  • one Double (the double wrapper class)
  • one Integer (the int wrapper class)
  • one String (duh)
  • one Person (a custom data class)

Here is the class, with all its mutator methods. Note the private constructor and the getCopy() method that uses it.

Data.java

package net.dangertree.copyconstructors;

public class Data {
	private Double doubleValue;
	private Integer integerValue;
	private String stringValue;
	private Person person;

	public Data(Double d, Integer i, String s, Person person) {
		super();
		this.doubleValue = d;
		this.integerValue = i;
		this.stringValue = s;
		this.person = person;
	}

	/*
	 * This method must ensure that all inner data elements
	 * are created into new object references.
	 */

	private Data(Data toCopy) {
		this.doubleValue = new Double(toCopy.getDoubleValue());
		this.integerValue = new Integer(toCopy.getIntegerValue());
		this.stringValue = new String(toCopy.getStringValue());
		// we rely on 'Person' to create its own copy
		this.person = toCopy.getPerson().getCopy();
	}


	public Data getCopy() {

		return new Data(this);

	}

	public Double getDoubleValue() {
		return doubleValue;
	}

	public void setDoubleValue(Double d) {
		this.doubleValue = d;
	}

	public Integer getIntegerValue() {
		return integerValue;
	}

	public void setIntegerValue(Integer i) {
		this.integerValue = i;
	}

	public String getStringValue() {
		return stringValue;
	}

	public void setStringValue(String s) {
		this.stringValue = s;
	}

	public Person getPerson() {
		return person;
	}

	public void setPerson(Person person) {
		this.person = person;
	}

	@Override
	public String toString() {
		return "Double value\t" + doubleValue + "\nInteger value\t"
				+ integerValue + "\nString value\t" + stringValue
				+ "\nPerson\t" + person.toString() + "\n";
	}
}

The private constructor does the actual copying. The public getCopy() method calls it and passes in this, so the copy is made from the object that getCopy() was called on. The job of the private constructor is to guarantee that each element of the copy points to a new object in memory but has the same value as the corresponding element of the Data passed in. Double, Integer, and String all provide public constructors, making it easy for us to create new objects with new memory locations. But for a custom data class like Person, we need to ensure that it is copyable as well.

It would be logical to create a Copyable interface that provides a getCopy() method within its contract. Then both Data and Person could implement it.
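One way such an interface might look is sketched below. The name Copyable comes from the suggestion above, but the generic type parameter is my own assumption, added so that getCopy() returns the concrete type without a cast. Owner and the copyAll() helper are hypothetical and exist only to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// The type parameter is an assumption; it lets getCopy()
// return the implementing type instead of Object.
interface Copyable<T> {
    T getCopy();
}

// Hypothetical stand-in for a class like Person.
class Owner implements Copyable<Owner> {
    private String name;

    Owner(String name) {
        this.name = name;
    }

    private Owner(Owner toCopy) {
        this.name = toCopy.name;
    }

    @Override
    public Owner getCopy() {
        return new Owner(this);
    }

    String getName() {
        return name;
    }

    void setName(String name) {
        this.name = name;
    }
}

class CopyableDemo {
    // Works for any Copyable type, which is the payoff of the interface.
    static <T extends Copyable<T>> List<T> copyAll(List<T> source) {
        List<T> copies = new ArrayList<>(source.size());
        for (T item : source) {
            copies.add(item.getCopy());
        }
        return copies;
    }

    public static void main(String[] args) {
        List<Owner> owners = new ArrayList<>();
        owners.add(new Owner("Matt"));

        List<Owner> copies = copyAll(owners);
        copies.get(0).setName("Eric");

        System.out.println(owners.get(0).getName()); // prints "Matt"
    }
}
```

With the interface in place, generic code such as copyAll() can deep copy a whole collection without knowing the element type.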

The Person class must also be able to provide a copy of itself, and it can use the same method as Data.

Person.java

package net.dangertree.copyconstructors;

public class Person {
	private String name;
	private int age;

	public Person(String name, int age) {
		super();
		this.name = name;
		this.age = age;
	}

	private Person(Person toCopy) {
		this.name = new String(toCopy.getName());
		this.age = toCopy.getAge();
	}


	public Person getCopy() {

		return new Person(this);

	}

	public int getAge() {
		return age;
	}
	public void setAge(int age) {
		this.age = age;
	}
	public String getName() {
		return name;
	}
	public void setName(String name) {
		this.name = name;
	}

	@Override
	public String toString() {
		return name + ", " + age;
	}

	@Override
	public boolean equals(Object obj) {
		if(this == obj)
			return true;
		if((obj == null) || (obj.getClass() != this.getClass()))
			return false;
		Person compareTo = (Person)obj;
		// null-safe name comparison, consistent with hashCode() below
		return compareTo.getAge() == age
				&& (name == null ? compareTo.getName() == null
						: name.equals(compareTo.getName()));
	}

	@Override
	public int hashCode() {
		int hash = 1;
		hash = hash * 31 + age;
		hash = hash * 31
				+ (name == null ? 0 : name.hashCode());
		return hash;
	}
}

If you override the equals() method, always override the hashCode() method as well!

Now let's add a class that puts these objects together and shows how the deep copying works…

DataCopier.java

package net.dangertree.copyconstructors;

public class DataCopier {

	public static void main(String[] args) {
		// create data from scratch
		Data d1 = new Data(1.1, 2, "three", new Person("Matt Taylor", 28));

		System.out.println("Data 1: ");
		System.out.println(d1);

		// make a copy of the data to toy with
		Data d2 = d1.getCopy();
		System.out.println("Deep copy of Data 1 before changes:");
		System.out.println(d2);

		// now change the copy a little
		d2.setDoubleValue(3.14);
		d2.setIntegerValue(500);
		d2.setStringValue("changed copy");
		d2.getPerson().setAge(29);

		System.out.println("Deep copy of Data 1 after changes:");
		System.out.println(d2);

		System.out.println("The original, unchanged Data 1:");
		System.out.println(d1);
	}
}

If you run this, it produces the following output:

Data 1:
Double value 1.1
Integer value 2
String value three
Person Matt Taylor, 28

Deep copy of Data 1 before changes:
Double value 1.1
Integer value 2
String value three
Person Matt Taylor, 28

Deep copy of Data 1 after changes:
Double value 3.14
Integer value 500
String value changed copy
Person Matt Taylor, 29

The original, unchanged Data 1:
Double value 1.1
Integer value 2
String value three
Person Matt Taylor, 28

As you can see, the original is unchanged, even after its copy is altered. That is because the two objects' references point to different memory locations in the heap. This is what we want to happen with a deep copy.

Here is a unit test that verifies this:

CopyTest.java

package test.net.dangertree.copyconstructors;

import junit.framework.TestCase;
import net.dangertree.copyconstructors.Data;
import net.dangertree.copyconstructors.Person;

public class CopyTest extends TestCase {

	public void test_data_is_deep_copied() throws Exception {
		Data d1 = new Data(1.1, 2, "three", new Person("Matt Taylor", 28));
		Data d2 = d1.getCopy();

		// checks that the variables do not reference the same object
		assertNotSame(d1, d2);

		// checks that the objects contain the same data values
		assertEquals(d1.getDoubleValue(), d2.getDoubleValue());
		assertEquals(d1.getIntegerValue(), d2.getIntegerValue());
		assertEquals(d1.getStringValue(), d2.getStringValue());
		assertEquals(d1.getPerson(), d2.getPerson());

		// checks that the variables within the data do not reference
		// the same objects
		assertNotSame(d1.getDoubleValue(), d2.getDoubleValue());
		assertNotSame(d1.getIntegerValue(), d2.getIntegerValue());
		assertNotSame(d1.getStringValue(), d2.getStringValue());
		assertNotSame(d1.getPerson(), d2.getPerson());

	}

}

Finally, some things to remember:

  • Do not provide a public copy constructor. For more information on why, here is a good article.
  • The private constructor must make sure that every attribute of the copy is recreated from the incoming object and refers to a different memory location. For primitives, this happens automatically, but for objects you have to make sure.
  • All attribute objects must be easily copied if you want to make an object copyable. For this example, the primitive wrapper classes (Double and Integer) and String make it easy for you to create a copy with a public constructor that takes a primitive or an immutable string. For your custom classes, you can provide a getCopy() and a matching private copy constructor to do the same job.
  • The Object.clone() method does not provide a deep copy. It will create a new object in memory, but its reference attributes will point to the same objects as the original's attributes. So altering those attribute objects through the clone can potentially change the original.

Update: Eric Burke has helpfully provided a tip here. To paraphrase, there is no need to clone immutable fields like Double, Integer, and String.

After copying an object in the way I suggest, both share references to the same underlying immutable fields. If you change one of these references in the original (or copied) object, the two objects no longer share references. Since the values are immutable, you can only “change” the value by pointing to entirely new references.

Get his improved code update here.
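The idea in the update can be sketched as follows. This is not Eric's code. Measurement and Technician are hypothetical stand-ins for Data and Person, and only the mutable field gets a real copy.

```java
// Hypothetical stand-in for Person.
class Technician {
    private String name; // String is immutable
    private int age;

    Technician(String name, int age) {
        this.name = name;
        this.age = age;
    }

    private Technician(Technician toCopy) {
        this.name = toCopy.name; // sharing the reference is safe
        this.age = toCopy.age;
    }

    Technician getCopy() { return new Technician(this); }
    int getAge() { return age; }
    void setAge(int age) { this.age = age; }
}

// Hypothetical stand-in for Data.
class Measurement {
    private Double value;
    private String label;
    private Technician technician;

    Measurement(Double value, String label, Technician technician) {
        this.value = value;
        this.label = label;
        this.technician = technician;
    }

    private Measurement(Measurement toCopy) {
        this.value = toCopy.value;                     // immutable: share
        this.label = toCopy.label;                     // immutable: share
        this.technician = toCopy.technician.getCopy(); // mutable: copy
    }

    Measurement getCopy() { return new Measurement(this); }
    Double getValue() { return value; }
    void setValue(Double value) { this.value = value; }
    Technician getTechnician() { return technician; }
}

class ImmutableSharingDemo {
    public static void main(String[] args) {
        Measurement m1 = new Measurement(1.1, "three",
                new Technician("Matt Taylor", 28));
        Measurement m2 = m1.getCopy();

        m2.setValue(3.14);             // rebinds m2's field only
        m2.getTechnician().setAge(29); // touches m2's own Technician

        System.out.println(m1.getValue());               // prints "1.1"
        System.out.println(m1.getTechnician().getAge()); // prints "28"
    }
}
```

Note that with this approach the assertNotSame() checks on the Double, Integer, and String fields in a test like CopyTest would no longer pass, since those references are now shared on purpose.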


Unique ArrayList

Saturday, June 9th, 2007

Eric’s post about the LinkedHashSet got me thinking about some code I’ve recently written to provide an ArrayList that operates like a set. I wasn’t aware of the LinkedHashSet, but I think I implemented something with similar functionality called a UniqueArrayList. It looks very simple:

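import java.util.ArrayList;
import java.util.Collection;

public class UniqueArrayList<T> extends ArrayList<T> {
    /**
     * Only add the object if there is not
     * another copy of it in the list
     */
    @Override
    public boolean add(T obj) {
        for (int i = 0; i < size(); i++) {
            if (obj.equals(get(i))) {
                return false;
            }
        }
        return super.add(obj);
    }

    /**
     * Adds each element through add(); returns false
     * if any element was rejected as a duplicate
     */
    @Override
    public boolean addAll(Collection<? extends T> c) {
        boolean result = true;
        for (T t : c) {
            if (!add(t)) {
                result = false;
            }
        }
        return result;
    }
}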

I wanted a List that would only accept an object if it did not already contain an equal one, and this met that objective. When I decided to write this code, I remember trying to decide whether I wanted to make a List unique or provide order to a Set. I guess the Java API has already provided order to a set with LinkedHashSet, so maybe I should get rid of this class and replace every use of it with a LinkedHashSet.
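For comparison, here is a short sketch of what LinkedHashSet offers out of the box. The uniqueInOrder() helper is a hypothetical name of my own. LinkedHashSet itself rejects duplicates and iterates in insertion order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class LinkedHashSetDemo {
    // Hypothetical helper: drops duplicates while keeping first-seen order.
    static <T> List<T> uniqueInOrder(Collection<T> items) {
        return new ArrayList<>(new LinkedHashSet<>(items));
    }

    public static void main(String[] args) {
        Set<String> names = new LinkedHashSet<>();
        System.out.println(names.add("beta"));  // prints "true"
        System.out.println(names.add("alpha")); // prints "true"
        System.out.println(names.add("beta"));  // prints "false", already present
        System.out.println(names);              // prints "[beta, alpha]"

        List<String> unique = uniqueInOrder(
                Arrays.asList("beta", "alpha", "beta"));
        System.out.println(unique.get(1));      // prints "alpha"
    }
}
```

The one thing a Set does not give you is access by index, so copying into a List, as uniqueInOrder() does, is needed where positions matter.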

I also created a UniqueOverridingList that replaces an equal object already in the list with the one being added. This was useful when I needed to process a lot of objects but only keep the latest version of each one as I went.

import java.util.ArrayList;
import java.util.Collection;

public class UniqueOverridingList<T> extends ArrayList<T> {

    public enum LAST_RESULT {
        ADD, OVERRIDE, NOTHING;
    }

    // outcome of the most recent call to add()
    private LAST_RESULT lastResult;

    @Override
    public boolean add(T obj) {
        for (int i = 0; i < size(); i++) {
            if (obj.equals(get(i))) {
                // replace the equal element with the newer object
                set(i, obj);
                lastResult = LAST_RESULT.OVERRIDE;
                return true;
            }
        }
        boolean b = super.add(obj);
        if (b) {
            lastResult = LAST_RESULT.ADD;
        } else {
            lastResult = LAST_RESULT.NOTHING;
        }
        return b;
    }

    @Override
    public boolean addAll(Collection<? extends T> c) {
        boolean result = true;
        for (T t : c) {
            if (!add(t)) {
                result = false;
            }
        }
        return result;
    }

    public LAST_RESULT getLastResult() {
        return lastResult;
    }

}

After waiting and reading the comments on Eric’s blog, I see that my classes may behave more appropriately if I add the equals() and hashCode() methods to them. The equals() method should take an incoming List and compare it element by element to the current list.
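Before writing those methods by hand, it is worth checking what an ArrayList subclass already inherits. The List contract defines equals() as an element-by-element comparison and hashCode() to match, and ArrayList implements both. The sketch below uses a hypothetical empty subclass, PlainSubList, standing in for the classes above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical subclass that adds nothing of its own.
class PlainSubList<T> extends ArrayList<T> { }

class ListEqualsDemo {
    public static void main(String[] args) {
        List<String> mine = new PlainSubList<>();
        mine.add("a");
        mine.add("b");

        List<String> sameOrder = Arrays.asList("a", "b");
        List<String> otherOrder = Arrays.asList("b", "a");

        // equal to any List with equal elements in the same order
        System.out.println(mine.equals(sameOrder));                   // prints "true"
        System.out.println(mine.hashCode() == sameOrder.hashCode());  // prints "true"
        System.out.println(mine.equals(otherOrder));                  // prints "false"
    }
}
```

So overriding is only needed if the unique lists should compare differently from ordinary lists, for example by ignoring order.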


I’m in Fortran translation Hell

Tuesday, May 15th, 2007

I learned a lot more low-level concepts because I started programming in Fortran77 before Java. It taught me about some key computer science subjects that were not taught in my CIS curriculum in school. So I have an affinity for the ancient language. But lately I’ve been translating some old Fortran77 code into Java to be used in an engineering application at work, and sometimes it makes me want to grind down my eyeballs with a Dremel tool.

It’s not so much the Fortran77 language as it is the programmer who was responsible for the code I’m working on translating right now. I have to cut the guy some slack because there were no such things as software engineers back in the ’70s. So we ended up with mathematicians, physicists, and engineers doing anything they could with the tools they had available, without regard to the software developers who would come 30 years later to rehash their work into object-oriented languages.

Here is an example of one thing that is making my life miserable right now.

Fortran supports “GO TO” statements that allow you to arbitrarily “GO TO” any labeled line of code in the program at any time. While it seems like this might simplify things at times, it NEVER EVER EVER DOES. What the “GO TO” allows is the ability for the programmer to break out of any loop at any time and choose a new point for the execution to flow to. This means that you can jump out of a loop and never return to the place you jumped from. If you happen to jump into another loop, so be it. Fortran will happily continue along and start looping wherever the execution landed. This can make code almost impossible to understand. In fact, this is probably the source of the “spaghetti code” term.

Logic is sacrificed at some point in the whitespace between the typing of “GO” and “TO”.

If you are an OO programmer, there is no easy and straightforward way to break out of a loop and into another loop in the code, never to return to the previous loop. Even if you call another method to run some repetitious code, you’ll always return to the place where the original execution called the method. So if you want to make an OO language do something similar to what the “GO TO” does, you have to set up a way to break out of the loop immediately after the point where the “GO TO” occurs.
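In Java, one way to set up that kind of exit is a labeled break, sketched below. The firstNegative() method and its grid search are hypothetical, chosen only to show a jump out of two nested loops at once.

```java
class LabeledBreakDemo {
    // Returns {row, column} of the first negative entry, or null if none.
    static int[] firstNegative(double[][] grid) {
        int[] found = null;

        search:
        for (int i = 0; i < grid.length; i++) {
            for (int j = 0; j < grid[i].length; j++) {
                if (grid[i][j] < 0) {
                    found = new int[] { i, j };
                    // leaves both loops at once, like a GO TO
                    // aimed just past the end of the outer loop
                    break search;
                }
            }
        }

        // execution resumes here, where the GO TO label would sit
        return found;
    }

    public static void main(String[] args) {
        double[][] grid = { { 1.0, 2.0 }, { 3.0, -4.0 } };
        int[] hit = firstNegative(grid);
        System.out.println(hit[0] + ", " + hit[1]); // prints "1, 1"
    }
}
```

Unlike a GO TO, the labeled break can only jump forward to the end of an enclosing loop, so the flow stays readable. Jumps into the middle of another loop still have to be restructured by hand.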

Here is another example of my dilemma. There is something in Fortran77 called an “arithmetic IF statement”. It goes something like this:

IF (VAL) 10, 20, 30

The value is evaluated. If “VAL” is less than 0, execution will “GO TO” the statement labeled 10. If “VAL” equals 0, it will “GO TO” the statement labeled 20. And of course if “VAL” is greater than 0, it will “GO TO” the statement labeled 30.

Horrible. Just horrible.
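For what it's worth, the three-way branch translates into an ordinary if/else chain in Java. The sketch below is a hypothetical helper that just returns the label Fortran would jump to, which can be handy for checking a translation against the original listing.

```java
class ArithmeticIfDemo {
    // Mirrors IF (VAL) 10, 20, 30 by returning the
    // statement label that Fortran would jump to.
    static int branchLabel(double val) {
        if (val < 0) {
            return 10; // negative branch
        } else if (val == 0) {
            return 20; // zero branch
        } else {
            return 30; // positive branch
        }
    }

    public static void main(String[] args) {
        System.out.println(branchLabel(-2.5)); // prints "10"
        System.out.println(branchLabel(0.0));  // prints "20"
        System.out.println(branchLabel(7.0));  // prints "30"
    }
}
```

In a real translation each return would be replaced by the code that lives at that label, which is where the untangling work actually goes.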

To add insult to injury, the programmer who wrote the program I’m currently working on favors the arithmetic IF statement over regular IF statements. So instead of writing

IF (X .NE. 0) THEN
  CALL STUFF()
END IF

he writes

IF (X) 10, 20, 10
10 CALL STUFF()
20 CONTINUE

And he does this all the time. I’m going to go dunk my head in boiling pitch now.