Database

Database Optimization: Genetics {3}

by Austin P
As the need for databases increases, optimization is the natural next step in their evolution. Everyone wants better, faster, and more efficient technology, and databases are no exception. Givon Zirkind is the author of an academic journal article on the optimization of databases that store genetic data. Zirkind notes that, with data storage growing larger and cheaper than ever before, optimization should be easy. Unfortunately, many programs are not coded as efficiently as they could be, and this, along with other factors, leads to software bloat. According to Zirkind, “Software bloat is when a computer program has so many features, that a user cannot possibly know them all and use them all”. Zirkind describes a project he did to decrease the amount of bloat and excess data by laying out a specific software design and set of specifications. Among the ideas Zirkind and his group used were indexing method selection criteria and programming language selection. The indexing method selection involved using a B-tree, whose mathematical structure gives it superior access speed over a linked list. A B-tree is an organizational structure that stores and retrieves information in the form of a balanced tree. As for the programming language, Zirkind chose C for its performance and portability. After the software design and specification phase, the next step was to optimize, which was done through key compression and index size reduction. Key compression and index size reduction are important, but having what Zirkind calls “good engineering” is also a huge factor in optimization. Zirkind clarifies that good engineering is simple engineering: database design, and technology in general, needs to keep code and data simple, because the more information code and data structures carry, the more memory they consume.
In databases this means that load times are longer than needed. The practices Zirkind and his group used significantly increased the efficiency of their genetic database, improving access speed 7 to 9 times over the databases they used for testing. Also, according to the article, the database normally needed 7 disk accesses to record all of its data; with the new optimizations, that was reduced to a maximum of 2. The reduction in disk accesses was achieved by keeping frequently used data loaded in memory and by record blocking. read more...
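Zirkind's implementation was in C; purely as an illustration of why a B-tree's logarithmic search outperforms a linked list's linear scan, here is a small Python sketch (my own, not from the article) that counts key comparisons for each approach:

```python
# A B-tree locates a key in O(log n) steps; a linked list forces an O(n)
# scan. This sketch counts key comparisons for both strategies on the same
# sorted key set (binary search stands in for B-tree-style descent).

def linear_lookup_steps(keys, target):
    """Comparisons needed to find target by scanning node-by-node."""
    for steps, key in enumerate(keys, start=1):
        if key == target:
            return steps
    return len(keys)

def btree_like_lookup_steps(keys, target):
    """Comparisons for a balanced-tree-style (binary) search."""
    lo, hi, steps = 0, len(keys) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if keys[mid] == target:
            return steps
        if keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps

keys = list(range(100_000))
print(linear_lookup_steps(keys, 99_999))      # 100000 comparisons
print(btree_like_lookup_steps(keys, 99_999))  # at most 17 comparisons
```

The same gap explains the disk-access numbers: fewer comparisons per lookup translate directly into fewer blocks read from disk.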

2014, New Year For SQL Server {1}

by Patrick B
The next step in Microsoft’s Relational Database Management System looks promising for efficiency and speed in transaction processing. Microsoft recently held a community technology preview for the massive project, and this is what was found. SQL Server 2014 will come with a new In-Memory Online Transaction Processing (OLTP) feature called Hekaton that is a built-in part of the database system. Hekaton works by selecting data that is being read or written most frequently and moving it into the server’s working memory. This allows the priority data to be quickly accessed and ready for transactions or updates on the fly. When Hekaton optimization is enabled, it detects which data belongs in working memory and moves that data into main memory. Integrity of the data is maintained by writing all transactions to a separate log file in case of system outages. Beyond the increased speed, companies can expect cost savings, as Hekaton reduces the computational requirements needed to get data processing done, which means fewer servers and less hardware. read more...
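The hot-data promotion idea behind Hekaton can be illustrated with a toy sketch (plain Python, not Hekaton's actual mechanism; the promotion threshold and class names are invented):

```python
from collections import Counter

# Illustrative sketch of the hot-data idea: rows read frequently get
# promoted into an in-memory store, so later reads skip the slower
# disk-backed table. Not SQL Server's implementation.

class HotDataTable:
    def __init__(self, disk_rows, promote_after=3):
        self.disk = dict(disk_rows)   # stands in for the on-disk table
        self.memory = {}              # "memory-optimized" hot rows
        self.reads = Counter()
        self.promote_after = promote_after

    def get(self, key):
        if key in self.memory:        # fast path: row is already hot
            return self.memory[key]
        self.reads[key] += 1
        if self.reads[key] >= self.promote_after:
            self.memory[key] = self.disk[key]   # promote the hot row
        return self.disk[key]

table = HotDataTable({"order:1": "widget", "order:2": "gadget"})
for _ in range(3):
    table.get("order:1")
print("order:1" in table.memory)  # True: frequently read row promoted
print("order:2" in table.memory)  # False: cold row stays on disk
```

A real engine also durably logs each transaction, which is the role the separate log file plays in the article's description.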

Database Security, Secerno Solutions {6}

by Palek S

Database Security (IT’s biggest problem)

The author of this article, security guru David Litchfield, managing director of the UK company NGS, discusses database security and IT’s biggest problem by referencing the Black Hat conference, where he exposed over 20 vulnerabilities in IBM’s Informix database products. In this presentation I will discuss the two most prevalent areas of weakness in databases, and a new technology called Secerno that supplements data security offerings and protects against hackers and data breaches. read more...

Mongo DB (No SQL database for web) {6}

by John J
Today’s highly social and interactive web has created a market for a database management system that can offer fast, real-time access over the Internet while managing massive data sets that are growing by the minute in volume and complexity. MongoDB fills this need. As I will explain later in this blog, MongoDB is not the perfect solution for every project, but for certain tasks within its niche, it is the best solution. read more...
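Part of what makes MongoDB fit this niche is its document model: related data that a relational schema would split across joined tables can be embedded in one JSON-like document. A plain-dict sketch of the idea, with a toy matcher loosely in the spirit of MongoDB's find() (no server or driver involved; the data is invented):

```python
# One blog post stored as a single nested document rather than rows
# spread across posts, tags, and comments tables.
post = {
    "author": "john",
    "title": "Why NoSQL?",
    "tags": ["mongodb", "nosql", "web"],
    "comments": [                      # embedded, no JOIN required
        {"user": "amy", "text": "Nice post"},
        {"user": "raj", "text": "Agreed"},
    ],
}

def matches(doc, query):
    """Tiny query-by-example matcher, loosely like MongoDB's find()."""
    for field, wanted in query.items():
        value = doc.get(field)
        if isinstance(value, list):
            if wanted not in value:    # Mongo matches scalars inside arrays
                return False
        elif value != wanted:
            return False
    return True

print(matches(post, {"author": "john", "tags": "nosql"}))  # True
print(matches(post, {"tags": "sql"}))                      # False
```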

Compromised Data Quality by Malware

by Eric C
In an article from PCWorld entitled “Symantec warns of malware targeting SQL databases,” there has been a spread of malware infecting SQL databases around the world. Although not a serious threat, it could destroy data quality within the database. Originally targeted at Iran, the malware, called W32.Narilam, looks for Microsoft SQL databases on the infected server. If Microsoft SQL is found, the malware searches for specific keywords, such as “account” and “financial bond,” and replaces those keywords with random characters. Database administrators who do not make frequent backups of the database will be left with corrupted data and a loss of data integrity, which could prove disastrous for customers’ data, especially in a banking database. read more...
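The article implies that frequent backups are the main defense. One way to pair a backup with tamper detection is to hash each record at backup time and re-check the hashes later; a minimal sketch (my own, not Symantec's detection logic; the table and field names are invented):

```python
import hashlib

# Hash each record when the backup is taken; any record whose hash no
# longer matches has been altered since the backup, e.g. by malware
# scrambling a field with random characters.

backup_hashes = {}

def record_hash(record):
    return hashlib.sha256(repr(sorted(record.items())).encode()).hexdigest()

def take_backup(table):
    for key, record in table.items():
        backup_hashes[key] = record_hash(record)

def find_tampered(table):
    return [k for k, r in table.items() if record_hash(r) != backup_hashes[k]]

accounts = {"acct1": {"holder": "Ada", "bond": "B-1001"}}
take_backup(accounts)
accounts["acct1"]["bond"] = "x7Qz9"      # malware scrambles a keyword field
print(find_tampered(accounts))           # ['acct1']
```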

SQL, Easier in the Future

by Shigom H
Depending on the task, writing SQL queries can get complicated. In the article “Interactive SQL Query Suggestion: Making Databases User-Friendly,” Ju Fan proposes a neat tool for making SQL easier. By simply inserting keywords, the tool generates the corresponding SQL statements. This is similar to Microsoft Excel: those who are unfamiliar with Excel can type in keywords and Excel will find the corresponding formula. SQLSUGG is a program that suggests queries while the user types. Many databases already offer this type of functionality, but SQLSUGG differentiates itself by its ability to suggest advanced queries. Unfortunately, most database systems that offer this keyword functionality generate SQL queries from simple keywords that are only beneficial to “casual users” but useless to database administrators and SQL programmers. read more...
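The keyword-to-SQL idea can be sketched in a few lines. This toy version (not SQLSUGG itself; the schema catalog and matching rule are invented for illustration) matches keywords against table and column names and assembles a candidate SELECT:

```python
# A hypothetical schema catalog the suggester searches.
SCHEMA = {
    "customers": ["customer_id", "name", "city"],
    "orders": ["order_id", "customer_id", "total"],
}

def suggest_query(keywords):
    """Build one candidate SELECT from keyword hits on tables/columns."""
    tables, columns = set(), []
    for kw in keywords:
        for table, cols in SCHEMA.items():
            if kw == table:                    # keyword names a table
                tables.add(table)
            for col in cols:
                if kw in col:                  # keyword hits a column name
                    tables.add(table)
                    columns.append(f"{table}.{col}")
    select = ", ".join(columns) if columns else "*"
    return f"SELECT {select} FROM {', '.join(sorted(tables))}"

print(suggest_query(["customers", "total"]))
# SELECT orders.total FROM customers, orders
```

A real suggester like SQLSUGG also ranks many candidates and infers join conditions; this sketch stops at the matching step.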

MySQL: A threat to bigwigs? {1}

by Asim K
David Kirkpatrick, of Fortune.com and a contributor to CNN Technology, writes in his article “MySQL: A threat to bigwigs?” that MySQL is the rising Linux of backend systems. He begins by explaining the structure of MySQL (which is based on the SQL language) and the structure of open source software in general. Because MySQL is open source, it has far more potential to grow than commercial software does at its current pace, thanks to the huge “fan base” of developers that open source attracts. The catch: not only are these “fan” developers users of MySQL, but they are required, by MySQL’s ethical policy, to share any changes they make to the code back with MySQL – which is in the end a win-win for developers, end users, and MySQL as a public service. As Kirkpatrick points out, citing a quote from MySQL CEO Scott McNealy, huge companies like Yahoo and Google depend on MySQL to get their work done – and if they can do it, so can smaller companies. At $395 per year for a server, compared to Oracle’s $20,000, MySQL is a no-brainer. Kirkpatrick ends by stating that although free open source software still has its shortcomings, MySQL has a huge future ahead of it, as conveyed in the confident words of its CEO, Scott McNealy: “People ask me ‘What’s wrong-why are you leaving money on the table?’ We say ‘You should ask the other database companies what is wrong with their cost structure.” read more...

Dimensions of Data Quality {1}

by Kathy S
The author of this article starts off by introducing the idea of “dimensions” of data quality, such as accuracy, consistency, and timeliness, and asks whether these “dimensions” actually exist as intelligible concepts. The author believes a strong case can be made that we are not thinking as clearly as we could be in this area, and that there is room for improvement. He then asks where the term “dimension” comes from when talking about data quality. In this context, “dimension” is used as an analogy: the term gives the impression that data quality is as concrete as a solid object and that the dimensions of data quality can be measured. In data quality, the term dimension could be used interchangeably with criterion, a standard of judgment. Since data is immaterial, claiming that its dimensions can be measured is an astonishing claim. The author then asks: are the dimensions credible? The more duplication there is in a data set, the lower its quality is likely to be, while the more completeness there is, the higher its quality. Therefore, including “duplication” in a list of dimensions of data quality alongside “completeness” and “consistency” immediately creates an inconsistency in the list, since duplication runs in the opposite direction. A much more serious problem is that there seems to be no common agreement on what the dimensions of data quality actually are. Lastly, the author asks: are the dimensions over-abstractions? The worry is that each dimension is not a single concept, but either a collection of disparate concepts or a generalization. read more...
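Whether such dimensions are measurable at all is the author's question; for flat records they can at least be given simple operational definitions. A sketch (my own formulations, not the author's, over an invented dataset):

```python
# Two operational data-quality metrics: completeness rises with quality,
# duplication falls with it, which is the directional mismatch the
# article points out.

records = [
    {"name": "Ada", "email": "ada@example.com"},
    {"name": "Ada", "email": "ada@example.com"},   # exact duplicate
    {"name": "Bob", "email": None},                # missing value
]

def completeness(rows):
    """Fraction of fields that are populated."""
    total = sum(len(r) for r in rows)
    filled = sum(1 for r in rows for v in r.values() if v is not None)
    return filled / total

def duplication(rows):
    """Fraction of rows that exactly duplicate an earlier row."""
    seen, dupes = set(), 0
    for r in rows:
        key = tuple(sorted(r.items()))
        if key in seen:
            dupes += 1
        seen.add(key)
    return dupes / len(rows)

print(round(completeness(records), 2))  # 0.83: one of six fields missing
print(round(duplication(records), 2))   # 0.33: one of three rows duplicated
```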

Happy B-day SMS {3}

by Claudia J
Today being the last blog, I decided to write about an interesting article by John Biggs on the history of SMS. The article was called “Happy Birthday SMS,” which caught my attention right away. It discussed how, on December 3rd, 1992, a programmer from the United Kingdom sent, for the very first time, a short message saying “Merry Christmas” from his computer to his friend’s phone. He sent his message through the “new” technique known as the short messaging service (SMS). However, this new way of communication didn’t become popular until several years later. Today about eight trillion messages cross the air yearly, and adults between 18 and 25 years old send about 133 messages a week. read more...

Data Integration – Mashup Service {1}

by Andrew M
The article I read was entitled “Service-Oriented Architecture for High-Dimensional Private Data Mashup” by B. Fung, T. Trojer, P. Hung, L. Xiong, K. Al-Hussaeni and R. Dssouli. This article talks about the integration of data for a service known as a mashup. A mashup combines and integrates data from multiple sources and databases for web applications. The authors use a social media website such as Facebook as an example. Facebook uses mashups in the sense that it collects data from the user in the form of status updates, Instagram photos, Spotify songs, and check-ins at locations. All of these are examples of data that is sent in from multiple sources and combined together. Throughout the article, the authors discuss the multiple issues faced in protecting users’ data in this type of service. While users usually never give out phone numbers, addresses, or Social Security numbers, much important information is still given out. For example, when a person checks in at a location, they are broadcasting to everyone that they are at a certain place, which gives potentially unwanted people a reference as to where this user can be found. read more...
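The mashup pattern, joining per-user records from several sources, and one simple privacy measure, coarsening the check-in location before the merged view is released, can be sketched as follows (field names, sources, and the generalization rule are invented for illustration):

```python
# Three "sources" keyed by user id, standing in for the separate feeds a
# mashup service would pull from.
status_updates = {"u1": "Having coffee!"}
photos = {"u1": "IMG_2041.jpg"}
checkins = {"u1": "Cafe Roma, 5th & Main, Springfield"}

def generalize_location(location):
    """Keep only the city, dropping the exact venue and street."""
    return location.split(",")[-1].strip()

def mashup(user_id):
    """Join the per-user records, coarsening the location on the way out."""
    return {
        "user": user_id,
        "status": status_updates.get(user_id),
        "photo": photos.get(user_id),
        "location": generalize_location(checkins[user_id]),
    }

print(mashup("u1")["location"])  # "Springfield": venue and street suppressed
```

Generalizing quasi-identifiers like location before release is one of the standard privacy-preserving techniques the mashup literature discusses; the paper's actual method is more involved than this sketch.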