Artificial Data Gravity

Having covered Data Gravity several times on this blog, I thought it was time to cover a derivative topic: Artificial Data Gravity.

Recall that Data Gravity is the attractive force created by Data amassing and the needs of Apps and Services to leverage low latency and high bandwidth.

Artificial Data Gravity is the creation of attractive forces through indirect or outside influence. That influence can take the form of costs, throttling, specialization, legislation, usage rules, or other means. Below I will walk through examples of Public Clouds creating, exerting, and leveraging Artificial Data Gravity.

Costs : AWS S3 and Windows Azure both offer free, unlimited inbound data transfer, which is a great example of artificially encouraging Data to amass internally. By making it free to put more Data inside S3 or Azure, the providers encourage Data Gravity patterns through Artificial means.

Throttling : The Twitter API is a great example, with its well-known limit of 350 calls per hour. This makes it nearly impossible to replicate the traffic on Twitter without special (and very expensive) agreements in place.
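
To put that limit in perspective, here is a minimal back-of-the-envelope sketch in Python. The 350 calls per hour figure comes from the paragraph above; the function name and the workload size in the note that follows are hypothetical, chosen only for illustration.

```python
CALLS_PER_HOUR = 350  # rate limit quoted in the post


def hours_to_make_calls(total_calls, calls_per_hour=CALLS_PER_HOUR):
    """Return the whole hours needed to issue total_calls under an hourly cap."""
    if total_calls < 0 or calls_per_hour <= 0:
        raise ValueError("total_calls must be >= 0 and calls_per_hour must be > 0")
    # Ceiling division with integers: a partial hour still costs a full hour.
    return -(-total_calls // calls_per_hour)


# Calls a single client can make in a full day at the quoted limit.
calls_per_day = CALLS_PER_HOUR * 24
```

Under these assumptions a single client gets 8,400 calls per day, and a hypothetical workload of 1,000,000 calls would take 2,858 hours, or roughly 119 days.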

Specialization : Specialized services such as DynamoDB not only encourage Data Gravity through transfer pricing, but also encourage a low-write, high-read pattern based on a 1:5 ratio. Not only are you unlikely to ever leave DynamoDB, you are also encouraged to write code that is as write-efficient as possible due to costs.

Legislative : There are many laws that restrict the location of Data and govern its security and use. These are not technical or physics-related constraints, but artificial means of influencing Data Gravity, as mentioned in this GigaOM piece covering the law dictating Data Gravity.

Usage : Dropbox charges each individual user for use of Shared Data (Artificial Usage). This means that each person pays for the Data consuming their storage, even though Dropbox is only storing a single copy and pointing all authorized users to that single copy.
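
The gap between billed and stored Data can be sketched with simple arithmetic. This is only an illustration of the accounting described above, assuming a single stored copy as the paragraph states; the function name and the numbers are hypothetical and are not Dropbox figures.

```python
def shared_data_accounting(shared_gb, user_count):
    """Compare quota charged across users with storage actually consumed.

    Hypothetical model: every user is charged the full size of the
    Shared Data, while the provider keeps one physical copy.
    """
    if shared_gb < 0 or user_count < 1:
        raise ValueError("shared_gb must be >= 0 and user_count must be >= 1")
    charged_gb = shared_gb * user_count  # each user's quota is debited in full
    stored_gb = shared_gb                # a single copy is kept
    return charged_gb, stored_gb


# Illustrative numbers only: a 10 GB shared folder with 5 authorized users.
charged, stored = shared_data_accounting(shared_gb=10, user_count=5)
print(f"charged against quotas: {charged} GB, physically stored: {stored} GB")
```

With these made-up numbers, 50 GB is charged against user quotas while 10 GB is physically stored.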

There are certainly other forms of Artificial Data Gravity that are not listed in the examples above. If you can think of a concrete example, please comment.

One last note : I’m not saying there is anything particularly wrong with Artificial Data Gravity. However, it is something to be aware of, as it is one of the behaviors/motivations exhibited within Data Gravity as a whole.

4 Comments

vambenepe says:

2012/02/20 at 3:14 PM

I had pizza for lunch. It made me thirsty, so I went to get some water. I’m pretty sure there’s a good “data gravity” analogy here, can you help me find it? 😉

OK, just teasing. But to be honest you’ve lost me a bit. I think “data gravity”, as you initially coined it, is a great and useful analogy. I’ve happily re-used it in conversations. But I don’t see how the “artificial data gravity” extension to it provides much enlightenment. At least I’m missing it.

I’m not saying the analogy is incorrect. Analogies aren’t correct or not. They’re just useful or not, and the usefulness of this extension of the analogy escapes me.

1. @mccrory says:
  
  2012/02/26 at 10:57 PM
  
  It is the split between Data Gravity created by Massing Data, the need for High Throughput, Low Latency, and the attraction of Apps and Services vs. Data Gravity created by Regulations, Policies, Pricing, and such. One is Gravity by nature, the other is gravity by artificial (made up and manipulatable) means. Hopefully that helps?
  
Matt says:

2012/02/29 at 12:56 PM

Data gravity — interesting concept and the first I’ve heard of it. It makes sense. Also it helps explain why certain websites (data sources) can maintain popularity due to their data even though other more technologically advanced sites might pop up later.
