Archive for October, 2009

SI prefixes for hard drive sizes

Posted on Thursday, October 15th, 2009

An interesting change in Mac OS X 10.6 is that Apple switched the representation of file and hard drive sizes to base-10 SI units instead of base-2 (binary) units. This means that a 60 GB hard drive which usually shows up as 55.9 GB will now appear within a few megabytes of the advertised 60 GB. I wasn’t really that happy with the idea at first, but when you think about it, it actually makes a lot of sense.

Bits and bytes
Now, it seems logical that since a computer works in binary, file sizes need to be represented with binary prefixes too. But in reality, the number of bytes in a kilobyte is completely arbitrary, since the operating system and the applications that run on it only represent file sizes in bytes. We could decide that there were 2504.25 bytes in a kilobyte and the computer wouldn’t care at all. If we’re just shortening the sizes for our convenience, why make it so hard for ourselves? Doesn’t it make more sense that a 1500KB file is 1.5MB, and not 1.465MB?
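To make the arithmetic concrete, here is a rough sketch of how the same byte count comes out under each convention. FormatSize and its unit labels are names I made up for illustration; this is not how any particular operating system does it.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Format a byte count in steps of 1000 (SI) or 1024 (binary).
// FormatSize is a hypothetical helper, not an OS or library API.
std::string FormatSize(std::uint64_t bytes, bool useSI)
{
    const double base = useSI ? 1000.0 : 1024.0;
    const char* const siUnits[]  = { "B", "KB", "MB", "GB", "TB" };
    const char* const binUnits[] = { "B", "KiB", "MiB", "GiB", "TiB" };

    double value = static_cast<double>(bytes);
    int unit = 0;
    // Divide down until the value fits the largest unit we know about.
    while (value >= base && unit < 4)
    {
        value /= base;
        ++unit;
    }

    char buffer[32];
    std::snprintf(buffer, sizeof(buffer), "%.2f %s", value,
                  useSI ? siUnits[unit] : binUnits[unit]);
    return buffer;
}
```

With this, FormatSize(60000000000ULL, true) gives "60.00 GB", while FormatSize(60000000000ULL, false) gives "55.88 GiB" for exactly the same drive.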

But what about file transfers?
I’ve heard people say that changing file sizes to SI units will make working with file transfer speeds confusing, but the opposite is true: bitrates are already expressed in SI units, so showing file sizes the same way makes it easier to work out how long a file will take to transfer.
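As a quick illustration, here is the calculation with everything in SI units. TransferSeconds is a hypothetical helper of my own, and it ignores protocol overhead, so treat the result as a lower bound rather than a prediction.

```cpp
#include <cstdint>

// Seconds needed to move `bytes` over a link rated in megabits per second,
// where 1 Mbit/s = 1,000,000 bits per second. Protocol overhead is ignored.
// TransferSeconds is a hypothetical helper for illustration only.
double TransferSeconds(std::uint64_t bytes, double megabitsPerSecond)
{
    const double bits = static_cast<double>(bytes) * 8.0;
    return bits / (megabitsPerSecond * 1000000.0);
}
```

A 700 MB file (700,000,000 bytes) over a 100 Mbit/s link works out to 56 seconds, which you can do in your head: 700 × 8 / 100. Start from 700 MiB instead and you first have to convert to 734,003,200 bytes.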

RAM and SSDs
Now, since hard drives are designed around SI units, it makes sense to use SI units for their capacity and for file sizes. But what about solid state drives and RAM? Well, RAM should stay with binary sizing, but for solid state drives it doesn’t really matter. The only difference is that an SSD sold as 256 GB (which is actually 256 GiB using the correct prefix) would show up as 274.9 GB instead. Does this really matter? We’ve put up with our hard drives showing up as smaller for so many years (a 1TB drive shows up as about 931GB), so I don’t think it’d be too bad.

Really, all we need is standardisation. A good first step towards this would be to show RAM, solid state drive and hard drive sizes using the correct prefixes (GB and MB for SI units, GiB and MiB for binary). Unfortunately ‘mebibyte’ and ‘gibibyte’ sound pretty stupid compared to megabyte and gigabyte though… But that’s all the more reason to switch to SI units.

C++0x is insanely awesome

Posted on Tuesday, October 13th, 2009

I came across an FAQ that outlines some of the features coming in the new C++ standard, C++0x, which should be finalised sometime this year or next. It has some awesome features that, in my opinion, bring some of my favourite parts of managed languages like C# to the much faster and more efficient C++. I’m already using some of them in my applications, as a few are available as part of Technical Report 1, in the std::tr1 namespace. A lot of the best features (like the concurrency features) are still to come in most compilers though.

Here are some of the new features that stand out to me:

Smart Pointers

One of the biggest complaints about C and C++ is that you have to manage your own memory, which means that allocating memory and then forgetting to free it causes problems like memory leaks. The current version of C++ does have a smart pointer, called std::auto_ptr, that can help with this, but it’s not really very good, because only one auto_ptr can own a given object at a time, and copying an auto_ptr hands ownership over to the copy. C++0x now features std::shared_ptr, which is far better: any number of shared_ptrs can refer to the same object, and the object is deleted when the last one goes away.
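Here is a small sketch of what shared ownership looks like. Texture and SharedOwners are hypothetical names for illustration, and I’m writing std::shared_ptr as it appears in the new standard; with a TR1 library the same class lives in std::tr1.

```cpp
#include <memory>

// Hypothetical resource that tracks how many instances are alive.
struct Texture
{
    static int liveCount;
    Texture()  { ++liveCount; }
    ~Texture() { --liveCount; }
};
int Texture::liveCount = 0;

// Returns the number of owners while two shared_ptrs refer to one Texture.
long SharedOwners()
{
    std::shared_ptr<Texture> a = std::make_shared<Texture>();
    std::shared_ptr<Texture> b = a;   // copying shares ownership
    return a.use_count();             // 2 while both a and b are alive
}   // a and b go out of scope here, so the Texture is deleted automatically
```

There is no delete anywhere: once SharedOwners returns, Texture::liveCount is back to zero.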

Concurrency

Standard C++ has had no notion of threads in the past, but in C++0x we will be able to use threading, locks, and other concurrency features without having to rely on third-party libraries, which are usually quite platform-specific.
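As a rough sketch of how this looks with the standard library alone, the function below starts a few threads that all bump one counter, with a mutex keeping the increments from trampling each other. CountWithThreads is a made-up name for illustration.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Increment a shared counter from several threads, guarded by a mutex.
// CountWithThreads is a hypothetical example function.
int CountWithThreads(int threadCount, int incrementsPerThread)
{
    int counter = 0;
    std::mutex counterMutex;
    std::vector<std::thread> workers;

    for (int t = 0; t < threadCount; ++t)
    {
        workers.emplace_back([&counter, &counterMutex, incrementsPerThread]()
        {
            for (int i = 0; i < incrementsPerThread; ++i)
            {
                // lock_guard releases the mutex when it goes out of scope
                std::lock_guard<std::mutex> lock(counterMutex);
                ++counter;
            }
        });
    }

    // Wait for every worker to finish before reading the result.
    for (std::thread& worker : workers)
        worker.join();

    return counter;
}
```

CountWithThreads(4, 1000) returns 4000 every time; without the lock the result could come up short. Some toolchains (GCC on Linux, for example) may need the -pthread flag to build this.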

Hash tables

C++ already has std::map, which maps a key, such as a string, to an object, but it has an O(log n) lookup time – that is, retrieving a result gets slower as you put more items in it. C++0x now features std::unordered_map and three other hash containers (std::unordered_set, std::unordered_multimap and std::unordered_multiset), which have a faster lookup time that is constant on average. The tradeoff is that adding an item can occasionally be slow, because the table has to be resized and rehashed as it grows, and the items are no longer kept in sorted order.

The unordered_map should really be called something like hash_map, but there are some compilers and third party libraries that already use that name.
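A minimal sketch of the hash table in use, counting how often each word appears. CountWords is a hypothetical function name; the container calls are the standard ones.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Count how many times each word appears in the input.
// CountWords is a hypothetical example function.
std::unordered_map<std::string, int> CountWords(const std::vector<std::string>& words)
{
    std::unordered_map<std::string, int> counts;
    // Reserving buckets up front avoids rehashing while the table fills.
    counts.reserve(words.size());

    for (const std::string& word : words)
        ++counts[word];   // operator[] inserts a zero count for a new key

    return counts;
}
```

The interface is close enough to std::map that swapping one for the other is mostly a matter of changing the type, as long as the code doesn’t depend on the keys being sorted.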

Range For

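This is one of those things that is going to be a big timesaver when writing C++ code. Previously, to iterate through a container like a map, you’d have to do something like this:
// Assuming m_Elements maps std::string keys to Element::Ptr values
for (std::tr1::unordered_map<std::string, Element::Ptr>::iterator it = m_Elements.begin();
     it != m_Elements.end(); ++it)
{
    // Draw the widget
    it->second->Draw();
}

Now you can use the range for statement, which lets you do this:
// Each element of a map is a key/value pair, so the widget is in .second;
// auto lets the compiler work out the pair type
for (auto& entry : m_Elements)
{
    entry.second->Draw();
}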

This is a lot like the foreach statement in C#, which is really nice.

Tuples

Tuples are like arrays that can hold different types, and they are really handy if you want to return more than one value from a function. Instead of constructing a class or struct to hold the data, which would take a lot of extra code, you can just construct a tuple for it. Tuples are extensively used in Python, and are extremely handy.
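For instance, here is a sketch of a function that hands back three values at once. Summarise is a hypothetical name, and it assumes the vector passed in is not empty.

```cpp
#include <cstddef>
#include <tuple>
#include <vector>

// Return the smallest value, the largest value and the element count.
// Summarise is a hypothetical example; it assumes `values` is not empty.
std::tuple<int, int, std::size_t> Summarise(const std::vector<int>& values)
{
    int lowest = values.front();
    int highest = values.front();
    for (int v : values)
    {
        if (v < lowest)  lowest = v;
        if (v > highest) highest = v;
    }
    return std::make_tuple(lowest, highest, values.size());
}
```

On the calling side, std::tie(lowest, highest, count) = Summarise(values); unpacks the result straight into three existing variables, or you can pick out one element with std::get<0>(result).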

There are a lot of other great changes coming to the language, and the FAQ has a lot of good examples. The only problem now is waiting for the compilers to support the new standard…