Data Archive

SQL Injection Remains a Constant Threat

by Brian B
The article I picked this week is titled “Black Hat is Over, But SQL Injection Attacks Persist” by Victor Cruz. The article starts off with an attack that happened earlier this year, a breach that leaked 400,000 Yahoo usernames and passwords. It says SQL injection attacks have also affected companies such as Sony and LinkedIn recently, so this is obviously still a large threat to companies. The author gives an example of SQL injection, saying that “hackers visit a website and fill out a text field with a SQL statement such as 1+1=2, which the log-in field interprets as true, allowing it to pass as legitimate credentials (Cruz, 2012).” This causes the server to accidentally release confidential information because it has been tricked into thinking that a valid user has logged into the system. The article goes on to state that “Privacy Rights Clearinghouse reported that 312 million data records have been lost since 2005 and 83% of hacking-related data breaches were executed via SQL injection attacks (Cruz, 2012).” It then talks about more reliable software being developed to look for SQL injections and distinguish them from safe input. The tool in question is called “libinjection” and is able to sort through heaps of data by converting input into tokens and checking the resulting tokens for anything that may be an attempt to attack the server. The article finishes by saying that “SQL Injection attacks are automated and website owners may be blissfully unaware that their data could actively be at risk (Cruz, 2012).” read more...
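The classic defense against the trick Cruz describes is to keep user input out of the SQL text entirely. Here is a minimal sketch of my own (using Python's built-in sqlite3 module, not anything from the article) contrasting a vulnerable string-built login query with a parameterized one:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

# Attacker-controlled input: a tautology that makes the WHERE clause
# always true, like the "1+1=2" example in the article.
name = "alice"
password = "' OR '1'='1"

# VULNERABLE: user input is spliced directly into the SQL text.
query = f"SELECT * FROM users WHERE name = '{name}' AND password = '{password}'"
print(conn.execute(query).fetchall())   # "logs in" without the real password

# SAFE: placeholders send the input as data, never as SQL.
safe = "SELECT * FROM users WHERE name = ? AND password = ?"
print(conn.execute(safe, (name, password)).fetchall())  # returns nothing
```

Tools like libinjection work at a different layer, tokenizing input and flagging token sequences that look like SQL; parameterized queries remove the injection channel in the first place.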

Optimizing SQL Server Performance

by Kathy S
In this journal article the authors state that efficiency and performance are often the last criteria considered when designing and developing new applications that use a database. Sometimes the application does not display the information requested from the database in a reasonable time, or fails to display it at all. The reasons may be related to the application design, but in many cases the DBMS does not return the data quickly enough due to missing indexes, deficient design of the queries and/or database schema, excessive fragmentation, inaccurate statistics, failure to reuse execution plans, or improper use of cursors. The authors then review the objectives that should be considered in order to improve the performance of SQL Server instances. The most important objectives are: 1) designing an efficient data schema, 2) optimizing indexes, stored procedures, and transactions, 3) analyzing execution plans and avoiding recompiling them, 4) monitoring access to data, and 5) optimizing queries. The authors conclude that optimization is an iterative process that includes identifying bottlenecks, solving them, measuring the impact of changes, and reassessing the system from the first step to determine whether satisfactory performance has been achieved. They also highlight that superior performance can be obtained by writing efficient code at the application level and properly using design and database development techniques. read more...
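To make the indexing objective concrete, here is a small sketch of my own (it uses Python's sqlite3 rather than SQL Server, but the idea carries over) showing how adding an index changes the execution plan from a full table scan to an index search:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
conn.executemany("INSERT INTO orders (customer, total) VALUES (?, ?)",
                 [(f"cust{i % 1000}", i * 1.5) for i in range(10_000)])

query = "SELECT total FROM orders WHERE customer = 'cust42'"

# Without an index, the optimizer must scan every row.
print(conn.execute("EXPLAIN QUERY PLAN " + query).fetchall())
# -> ... SCAN orders

# With an index on the filtered column, it can seek directly.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
print(conn.execute("EXPLAIN QUERY PLAN " + query).fetchall())
# -> ... SEARCH orders USING INDEX idx_orders_customer (customer=?)
```

On SQL Server the equivalent check is the graphical execution plan or SHOWPLAN output, but the loop the authors describe is the same: measure, change, re-measure.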

Relational Database 101

by Kathy S
For those students who have no prior experience or knowledge of database design, this is a great read. The author of the article explains the introductory terms and describes how data is organized and represented in a relational database. The following are the basics one needs to know. In a relational database, data is stored in a two-dimensional matrix (table), and within the table there are multiple columns and rows. Relational Database Management System (RDBMS) software gives users the ability to read and manipulate data. The RDBMS relies on SQL (Structured Query Language) constructs and keywords to access the tables and the data contained within the tables' columns and rows. The article clarifies that each table in a relational database contains information about a single type of data and has a unique name that is distinct from all other tables in that schema (a grouping of objects/tables that serve a similar business function). The author then points out the keys to good relations. The primary key is very important; it is a column that ensures uniqueness for every row in a table. The author then explains how a relational database connects (relates) tables and organizes information across multiple tables. The foreign key is an important connector that identifies a column or set of columns in one table that refers to a column or set of columns in another table. The author then states that the key to understanding relational databases is knowledge of data normalization and table relationships. The objective of normalization is to eliminate redundancy and thereby avoid future problems with data manipulation. There are five commonly accepted normal forms, but many programmers, analysts, and designers do not normalize beyond the 3rd normal form, although experienced database designers may. The author goes on to describe what the 1st, 2nd, and 3rd normal forms look like. Lastly, the article mentions how SQL fits in: SQL helps to create new data, delete old data, modify existing data, and retrieve data from a relational database. read more...
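As a concrete illustration of those terms (my own sketch, not from the article), here is a two-table schema in Python's sqlite3 where a primary key guarantees row uniqueness and a foreign key relates one table to another instead of repeating data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # sqlite enforces FKs only when asked

conn.executescript("""
-- Each table holds a single type of data; the primary key makes rows unique.
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);

-- The foreign key relates each order to exactly one customer, rather than
-- redundantly repeating customer details per order (the goal of normalization).
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    total       REAL NOT NULL
);
""")

conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders VALUES (10, 1, 25.0)")

# SQL retrieves related data by joining on the key columns.
for row in conn.execute("""
        SELECT c.name, o.total
        FROM customers c JOIN orders o ON o.customer_id = c.customer_id"""):
    print(row)   # ('Ada', 25.0)

# An order pointing at a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO orders VALUES (11, 999, 5.0)")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```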

Defining Terms in Data Models

by Kathy S
There’s a challenge when it comes to defining terms for data models. The author of the article asks, “Does defining the actions something performs solve our definition issues? Or are we instead adding complexities, for example, assigning more than one meaning to the same data element?” The responses to those questions were grouped into three categories. “Defining a term by its actions is an effective technique,” according to Madhu Sumkarpalli, a business intelligence consultant. He says it is better because that way we can be specific about the term, or close to specific, rather than generic and abstract. Basing the term on its actions can define it appropriately and paint the proper picture. “Defining something by its actions is part of the solution,” according to Amarjeet Virdi, a data architect. He says data entities are meant to represent real-life objects, and those objects perform functions. But then a new question comes up: when the object ceases to perform its function, what then? Does it cease to exist, or have no value to the business anymore? Complexity increases. “Defining something by its actions is not recommended,” according to Wade Baskin, a senior database architect. He says mixing process with data is a dangerous practice. Data should have only one definition regardless of the process; if the data changes as it matures, then the change is reflected as a different data element. It is not good to change the current definition of an element based on process or location, and allowing fields with multiple meanings is dangerous and should be avoided. The author feels that defining a term by what it does is effective, and at least a starting point, because most business professionals define things by the roles they play, for example a person playing the role of a customer. The problem, though, is that this approach may eventually lead to data integration issues, hidden business logic, and the question of what happens to the term itself when the activity it performs stops. read more...

This Week on Survivor

by Asim K
On September 30, 2012, 10,000 engineers and business people gathered to discuss the future of a company that made extraordinary computing announcements promising a revolutionary future for people everywhere. That company isn't Apple. Though unknown to most, these suited men gathered to hear about Oracle's venture into the cloud. NASA missions started the trend decades ago; now it's the computing companies' turn. To be clear: cloud computing does not take place in physical clouds, but rather in a virtual cloud, and the trend is growing. With the advent and talk of Big Data(bases) that run into thousands of petabytes (a scale I don't know the term for), Oracle is gearing up to combat the likes of giant industry know-and-sell-it-all Amazon and the infamous popular-kid-in-high-school Google with its new cloud database, 12c. “He said the new hardware could shift data twice as fast as machines from EMC, and costs one-eighth as much as machines from I.B.M.” (Hardy), and this is no small promise. As Hardy explains in the article, this supposedly superior database would be the first of its kind in the world, and competitors are expected to push back. Hard. read more...

DATA TRIGGERS

by Claudia J
The article I read was about the good and bad aspects of MySQL's trigger feature. The article first explains to the reader that a trigger is a named database object, associated with a table, that becomes active when an event occurs for that table. Programmers use triggers as a way to perform checks on the values being inserted into a table, ensuring that the data complies with the standards set for the fields, or to perform calculations on values involved in an update to the database. Triggers can be set to activate before or after an insert, deletion, or update on a table. All the information about database triggers is stored in the TRIGGERS table of the INFORMATION_SCHEMA database. read more...
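To show the "check values before insert" use the article mentions, here is a minimal sketch of a BEFORE INSERT trigger. It is my own example, written against Python's built-in sqlite3 as a stand-in for MySQL, but the idea is the same:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL NOT NULL);

-- Reject inserts with a negative balance, mirroring the kind of
-- value check on incoming data that the article describes.
CREATE TRIGGER check_balance
BEFORE INSERT ON accounts
FOR EACH ROW
WHEN NEW.balance < 0
BEGIN
    SELECT RAISE(ABORT, 'balance must be non-negative');
END;
""")

conn.execute("INSERT INTO accounts (balance) VALUES (100.0)")      # accepted
try:
    conn.execute("INSERT INTO accounts (balance) VALUES (-5.0)")   # rejected
except sqlite3.IntegrityError as e:
    print("trigger fired:", e)
```

In MySQL the trigger body would instead raise the error with SIGNAL SQLSTATE, but the BEFORE/AFTER timing and INSERT/UPDATE/DELETE events work the same way.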

The Data Model Owner

by Kathy S
The author of the article I read is a data modeling consultant and instructor who asked the question, “Who owns the data model?” He received over 100 responses. Suppose an organization has a team of skilled developers, and these developers do the data modeling. When a developer from this organization asks, “Who owns the data model?”, how does one respond? 37% say the business should own the model, because “the modelers don’t ‘own’ the data model – they are only the caretakers of the model.” 22% say the development team, because as the owners they would keep all the members informed about changes while maintaining everyone's ability to make changes to the model as needed. 15% say the individual developer, because they have the skill and knowledge to manage the data model well. 11% say the application manager or database administrator, because they are the ones who feel the pain when the application is not working properly and the ones who are contacted first if it isn't working as it should. 15% say no one owns the model, simply because no one may have the big picture in mind. read more...

4 Data Models for an Effective Governance Program

by Garcello D
The article I have chosen for this week's blog is called “The Data Model's Role in Data Governance,” written by Jonathan G. Geiger, and in it he describes multiple levels of data models. At first I had no idea what data governance was, but after a little research I found that “Data governance refers to the overall management of the availability, usability, integrity, and security of the data employed in an enterprise” (www.searchdatamanagement.com). The author believes that data models play a role just as big as the other components of a data governance program. He then compares and explains the subject-area model, the business data model, the system data model, and the technology model in detail. He also connects the four, showing why they are essential for an effective governance program. read more...

FBI Moves from Paper to an Electronic Database

by Eric C
The FBI has always been known for its high-tech equipment, from computer forensics tools to the specialized tech gear that agents use. However, it wasn't until recently that the FBI finally completed a new database system that eliminates paper files and moves to the more modern approach of digitized paperwork. This new electronic file management system is called Sentinel, and it was originally scheduled for completion in 2009 with an estimated budget of $425 million. Due to delays and poor planning and organization, it arrived about three years late and about $26 million over budget. The project of transferring the FBI to an electronic system was contracted to Lockheed Martin Corporation, which managed the project until the FBI took it over because of the delays. read more...

Glacier Cloud Service By Amazon

by Hongde H
The article I chose to read for this week is “Amazon's Glacier Cloud Service Puts Data in the Storage Deep Freeze” by David Hill. Glacier is a cloud service recently introduced by Amazon Web Services (AWS). It is a low-cost storage service for the long-term preservation of data that owners rarely, if ever, expect to retrieve. For those who are not familiar with Glacier, I'd love to share this new storage service with you by explaining what Glacier can be used for, what it brings us, and why people would want to stick with it. read more...
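As a taste of how an archive lands in the deep freeze, here is a minimal sketch of my own using AWS's Python SDK, boto3 (not from Hill's article; the vault name and file are hypothetical placeholders):

```python
import boto3

# Assumes AWS credentials are configured in the environment.
glacier = boto3.client("glacier", region_name="us-east-1")

# One-time setup: create a vault to hold cold archives.
# accountId "-" means "the account making the request".
glacier.create_vault(accountId="-", vaultName="yearly-backups")

# Upload an archive; Glacier returns an archive ID you must keep,
# since later retrievals are requested by ID.
with open("backup-2012.tar.gz", "rb") as f:          # hypothetical file
    response = glacier.upload_archive(
        accountId="-",
        vaultName="yearly-backups",
        archiveDescription="2012 year-end backup",
        body=f,
    )
print("archive id:", response["archiveId"])
```

The catch the "deep freeze" metaphor captures is the retrieval side: getting data back is an asynchronous job measured in hours, which is exactly why Glacier is priced for data you do not expect to touch.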