Saturday, June 25, 2011

Embracing non-determinism



Computers are supposed to be deterministic. This is often the case for single processor machines. However, as you scale up, guaranteeing determinism becomes increasingly expensive.

Even on single-processor machines you face non-determinism on a semi-regular basis. Here are some examples:


  • Bugs combined with poor OS memory control that allows programs to read uninitialized memory. A recent example for me was when I used FontForge to process a large set of files, and it crashed on a different set of files every time I ran it.
  • /dev/random gives "truly random" numbers. But even a program that uses regular rand() seeded by time is essentially non-deterministic, since so many factors affect the time at which seed() is called.
  • Floating point operations on modern architectures can be non-deterministic because of run-time optimization: depending on the pattern of memory allocation, the processor may rearrange the order of your arithmetic operations. Since floating point addition is not associative, this produces slightly different results on reruns.
  • Soft errors in circuits due to radiation, about one error per day, according to stats here
  • Some CPU units are bad and produce incorrect results. I don't know exact stats on this, but the guy who did the Petasort experiment told me that it was one of the issues they had to deal with.
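The floating point item in the list above comes down to the fact that floating point addition is not associative, so any reordering of the same operations can change the answer. A minimal Python sketch (the helper name ordered_sum is just for illustration):

```python
def ordered_sum(values):
    """Add the values strictly left to right, as a plain loop would."""
    total = 0.0
    for v in values:
        total += v
    return total

# Same three numbers, two different orders of addition.
# 1e16 + 1.0 rounds back to 1e16, so the 1.0 is lost in the first order.
big_first = ordered_sum([1e16, 1.0, -1e16])
cancel_first = ordered_sum([1e16, -1e16, 1.0])

# A less dramatic case with everyday decimals.
left_grouped = (0.1 + 0.2) + 0.3
right_grouped = 0.1 + (0.2 + 0.3)

print(big_first, cancel_first)        # 0.0 1.0
print(left_grouped == right_grouped)  # False
```

Nothing here is random: each order is perfectly reproducible on its own. The non-determinism comes from not controlling which order you get.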

Those modes of failure may be rare on one computer, but once you scale your computation up to many cores, they become a fact of life.

In a distributed application like sorting a petabyte of data, you should expect about 1 hard drive failure, 100 irrecoverable disk read errors, and a couple of undetected memory errors if you use error-correcting RAM (more if you use regular RAM).

You can make your computation deterministic by adding checksums at each stage and recomputing the parts that fail. However, this is expensive, and becomes disproportionately more expensive as you scale up. When Czajkowski et al. sorted 100 trillion records deterministically, most of the time was spent on error recovery.
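To make the "checksum each stage, redo what fails" idea concrete, here is a minimal Python sketch. It is not how the petabyte sort was implemented; FlakyStore, read_verified and the key "part-0" are hypothetical names, and the corruption is simulated so that the example is reproducible.

```python
import hashlib

def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

class FlakyStore:
    """Hypothetical storage layer that corrupts its first `bad_reads` reads."""

    def __init__(self, bad_reads: int):
        self.blocks = {}
        self.bad_reads = bad_reads
        self.reads = 0

    def write(self, key: str, data: bytes) -> None:
        # Store the payload together with the checksum computed by the writer.
        self.blocks[key] = (data, checksum(data))

    def read(self, key: str):
        data, digest = self.blocks[key]
        self.reads += 1
        if self.reads <= self.bad_reads:
            # Simulate a single flipped bit in the first byte.
            data = bytes([data[0] ^ 0x01]) + data[1:]
        return data, digest

def read_verified(store: FlakyStore, key: str, max_attempts: int = 5):
    """Re-read until the payload matches its checksum, or give up."""
    for attempt in range(1, max_attempts + 1):
        data, digest = store.read(key)
        if checksum(data) == digest:
            return data, attempt
    raise IOError(f"{key}: checksum mismatch after {max_attempts} attempts")

store = FlakyStore(bad_reads=2)
store.write("part-0", b"sorted records")
data, attempts = read_verified(store, "part-0")
print(data, attempts)  # b'sorted records' 3
```

The result is the same on every run, but only because the two bad reads were paid for with two extra reads. That retry cost is the price of determinism, and it grows with the number of stages and machines involved.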

I've been running an aggregation that is modest in comparison, going over about 100 TB of data and getting slightly different results each time. Theoretically I could add recovery stages that get rid of non-determinism, but at a cost that is not justified in my case.

Large-scale analysis systems like Dremel are configured to produce approximate results by default. Otherwise, the long tail of packet arrival times means that you'll spend most of the time waiting for a few "stragglers".
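The straggler trade-off can be sketched in a few lines of Python. This is only an illustration, not Dremel's actual mechanism or API: the function name, the max_stragglers parameter and the latencies below are all made up.

```python
def aggregate_without_stragglers(shards, max_stragglers=1):
    """shards: list of (latency_seconds, partial_count) pairs.

    Returns the approximate total and how long we had to wait for it,
    ignoring the `max_stragglers` slowest shards.
    """
    needed = len(shards) - max_stragglers
    arrived = sorted(shards)[:needed]   # shards in order of arrival
    total = sum(count for _, count in arrived)
    wait = arrived[-1][0]               # latency of the last shard we used
    return total, wait

# Ten shards with 100 records each; one of them is very slow (made-up numbers).
latencies = [0.8, 1.0, 1.1, 0.9, 1.2, 1.0, 0.7, 1.3, 0.9, 45.0]
shards = [(latency, 100) for latency in latencies]

approx_total, approx_wait = aggregate_without_stragglers(shards)
exact_total = sum(count for _, count in shards)
exact_wait = max(latency for latency, _ in shards)

print(approx_total, approx_wait)  # 900 1.3
print(exact_total, exact_wait)    # 1000 45.0
```

In this toy setup the exact answer takes 45 seconds, while an answer covering 9 of the 10 shards is ready in 1.3 seconds. Which shard ends up being the straggler can differ from run to run, which is exactly why the approximate answer is not reproducible.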

Given the increasing cost of determinism, it makes sense to treat it as a cost-benefit question and trade off the benefit of reproducible results against the cost of 100% error recovery.

A related idea came up in Vikash Mansinghka's presentation at the NIPS 2010 workshop on Monte Carlo methods: real-world decision problems can be formulated as inference and solved with MCMC sampling, so it doesn't make sense to spend so much effort ensuring determinism only to destroy it with /dev/random. You could achieve greater efficiency by using analogue circuits to start with. His company, Navia, is working in this direction.

There's some more discussion of determinism on Daniel Lemire's blog.

18 comments:

  1. I agree. It is not only hardware errors. It is also the order of execution. Perhaps most non-determinism comes from multi-threaded applications.
