Thursday, December 13, 2012

SQL vs. NoSQL

Samuel Warren
CS416: Database Management
Professor Noel Broman
December 10, 2012

Executive Summary

Since 2004, a debate has raged between using relational databases queried with SQL and using non-relational alternatives, collectively called “NoSQL.” Even Google has not settled this debate. In a 2012 video on YouTube, Google developers staged a debate between SQL and NoSQL. It ended in a dead tie, with the developers agreeing that pairing the two would be a likely solution, at least in the short term.

Introduction

Arguably, one of the greatest resources available to any database administrator is Structured Query Language (SQL). However fast and powerful it can be, there is a contender for the throne in this field. “Not only SQL” (NoSQL) is a movement to move beyond the relational database model. The first usage of the term, in modern context, was in 1998 by Carlo Strozzi; “Ironically it’s relational database just one without a SQL interface. As such it is not actually a part of the whole NoSQL movement we see today” (Haugen, 2010). Haugen goes on to explain that in 2009 Eric Evans, then employed at the hosting company Rackspace, used the term to refer to a more recent surge of non-relational databases. Strong pros and equally resilient cons have been presented for both SQL and NoSQL in the debates.

SQL

SQL databases store data in “tables” made up of “columns.” This is a major advantage because it gives each piece of data a stable location that can be referenced, provided one labels and links back to it correctly. Getting data into the database is not terribly challenging. Removing data is not difficult either, once one knows the correct SQL syntax. SQL is a simple query language that is highly repeatable and flexible.
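As a minimal sketch of the operations described above, the following uses Python's built-in sqlite3 module with an in-memory SQLite database. The table name people, its columns, and the sample rows are hypothetical, chosen only for illustration; other SQL databases would accept very similar statements, though details may differ.

```python
import sqlite3

# Hypothetical schema and sample rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people ("
    "id INTEGER PRIMARY KEY, name TEXT, city TEXT, age INTEGER)"
)

# Getting data in: each value lands in a named column of a named table.
sample_rows = [
    ("Ada", "Seattle", 30),
    ("Ben", "Seattle", 40),
    ("Cy", "Portland", 25),
    ("Dee", "Portland", 99),
]
conn.executemany(
    "INSERT INTO people (name, city, age) VALUES (?, ?, ?)", sample_rows
)

# Removing data: delete the rows that match a condition on a column.
conn.execute("DELETE FROM people WHERE name = ?", ("Dee",))

# Reading data back: the same SELECT shape works for any table and columns.
remaining = conn.execute(
    "SELECT name, city, age FROM people ORDER BY name"
).fetchall()

# A compact aggregate query: average one column, grouped by another.
avg_age_by_city = conn.execute(
    "SELECT city, AVG(age) FROM people GROUP BY city ORDER BY city"
).fetchall()

print(remaining)
print(avg_age_by_city)
```

The statements follow the same pattern regardless of the table involved, which is part of what makes SQL repeatable: only the table and column names change.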
The inconvenience of SQL is the convoluted nature of linking many different tables together to retrieve one or two specific pieces of data. When compared to NoSQL, however, the ease of breaking down complex problems becomes a boon.

Let’s say that you want to compute the average age of people living in each city. In Cloud SQL [a specific product by Google], it’s as easy as this. All you have to do is select the average age and group by city. (Google Developers, 2012)

The query shown by the presenter was clear, easy to read, and followed the same syntax as the SQL queries written by any database administrator or analyst. This illustrates the muscle of SQL queries and further demonstrates their ease of use. With respect to queries, SQL is the hands-down winner over NoSQL.

One of the major reasons SQL has managed to thrive is that it has been refined to the point that anyone can learn a few commands and begin writing complex queries. Of NoSQL-based systems: “They’re not polished, and comfortable to use. They have new interfaces, and new models of working, that need learning” (Snell-Pym, 2010). While one can quickly pick up a SQL-based system and begin extracting information, it is not as easy to do so with a NoSQL-based system. According to Kahn, “[A] user can access data from the database without knowing the structure of the database table” (2011). That kind of accessibility is invaluable for resource managers needing to find staff who can handle a relational database with the power of SQL.

NoSQL

On the other side of the playing field, so to speak, is the non-relational model, led by several open-source NoSQL contenders.

Most agree that the “no” stands for “not only”—an admission that the goal is not to reject SQL but, rather, to compensate for the technical limitations shared by the majority of relational database implementations. In fact, NoSQL is more a rejection of a particular software and hardware architecture for databases than of any single technology, language, or product. (Burd, 2011)

This rejection of certain technical limitations has revealed highly desirable features in the process. The most notable is the ability to scale the database quickly under extreme transaction loads. Burd goes on to explain that with traditional SQL, as transactions between the servers and the databases increase to a frenzied pace and the queries grow larger and larger, the only real response is to put more hardware and storage behind the database.

Although each of these techniques extended the functionality of existing relational technologies, none fundamentally addressed the core limitations, and they all introduced additional overhead and technical tradeoffs[sic]. In other words, these were good band-aids but not cures. (Burd, 2011)

NoSQL enables much quicker information discovery, because the data lives within what the Google Developers called “entities” (2012). By contrast, the customary relational database spreads data across different tables and has to look it up using relational keys. The tables are then linked together using what SQL calls “JOIN” clauses within an individual query. A decrease in performance is easy to observe unless the database runs on a mature, well-laid-out system.

As the business model evolves concepts and data models often struggle to evolve and keep pace with changes. The result is often a data structure that is filled with archaic language and patched and adapted data.
As anyone who has had to explain that the value in a column has a different meaning depending on whether it is less than or greater than 100 or that "bakeries" are actually "warehouses" due to historical accident knows that the weight of history in the data model can be a serious drag in maintaining a system or incorporating new business ideas. (Rees)

Rees illustrates a problem common to all systems: change. As data changes, the current and dominant relational model may become extinct. SQL may not be up to the task of continuing to store and serve data in its current fashion. As new as it is, NoSQL may quickly become the standard that SQL is today. With such flexibility, NoSQL only needs more companies, like Google, to accept it and to learn how to work with both SQL and NoSQL in the interim.

References

Burd, G. (2011, October). NoSQL [PDF]. Retrieved December 12, 2012, from http://static.usenix.org/publications/login/2011-10/openpdfs/Burd.pdf

Google Developers. (2012, June 29). Google I/O 2012 - SQL vs NoSQL [Video file]. Retrieved December 12, 2012, from http://www.youtube.com/watch?v=rRoy6I4gKWU

Haugen, K. (2010, March 16). A brief history of NoSQL [Blog post]. Retrieved December 12, 2012, from http://blog.knuthaugen.no/2010/03/a-brief-history-of-nosql.html

Kahn, A. (2011, November 8). Difference between SQL and NoSQL: Comparision. Retrieved December 12, 2012, from http://www.thewindowsclub.com/difference-sql-nosql-comparision

Rees, R. (n.d.). NoSQL comparison. Retrieved December 12, 2012, from http://www.thoughtworks.com/articles/nosql-comparison

Snell-Pym, A. (2010). NoSQL vs SQL, why not both? Retrieved December 12, 2012, from http://www.cloudbook.net/resources/stories/nosql-vs-sql-why-not-both