Databases and Data Warehouses Archive

DBA, a great position for a CIS major

by Kevin S
The main purpose of a DBA is to perform maintenance and optimization tasks on a daily basis. However, according to Craig Mullins, DBAs often become much more than that. Because both IT and business colleagues rely on the DBA, the range of what is asked and expected keeps growing. A DBA should expect this and accept it, since it makes them more valuable to the company while also extending their own abilities. Beyond the standard DBA duties, opportunities to grow may include experience with new technologies, a better understanding of the meaning of data, active participation in application development, or simply a better understanding of the business.

SQL and the Future

by Asim K
In his 1998 article “The Future of SQL”, Craig S. Mullins describes the future of SQL as bright. Although that sounds optimistic, it was the present that worried Mullins. He begins by explaining why SQL was so successful in the 90s: it is an abstract, in-depth language used to query data and give it structure. If you know English, you can pick up SQL fairly easily, far more easily than COBOL, C, or Python source code, which gives SQL an advantage because users can become productive in less time. He also explains SQL’s natural flexibility: there is often more than one way to write the same query, and the alternatives can be equally efficient. The “threat of the present” was that SQL was under pressure from XML and was limited in what it could do compared with XML; the same went for Java. The future, Mullins says, is fuzzy logic working together with SQL: applying rough human logic to SQL code would help SQL expand and become even more than it was in the 90s. The article gives a few examples of this. Mullins ends by stating that the IT community needs infrastructure, and that SQL provides it like nothing else does.
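To make the flexibility point concrete, here is a minimal sketch of two ways to ask the same question in SQL. It is my own illustration, not an example from the Mullins article: the `customers` and `orders` tables and their rows are made up, and Python's built-in `sqlite3` module is used only so the queries can be run.

```python
import sqlite3

# Hypothetical schema and data, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Edgar');
    INSERT INTO orders VALUES (1, 1, 20.0), (2, 1, 35.5), (3, 3, 12.0);
""")

# Formulation 1: which customers have at least one order? (subquery with IN)
with_subquery = conn.execute(
    "SELECT name FROM customers "
    "WHERE id IN (SELECT customer_id FROM orders) "
    "ORDER BY name"
).fetchall()

# Formulation 2: the same question written as a join with DISTINCT.
with_join = conn.execute(
    "SELECT DISTINCT c.name FROM customers c "
    "JOIN orders o ON o.customer_id = c.id "
    "ORDER BY c.name"
).fetchall()

print(with_subquery)
print(with_join)
```

Both queries return the same customers. Whether they perform equally well depends on the database engine and its optimizer, so treat the "equally efficient" claim as something to check on your own system.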

A day in the life of a DBA

by Kevin S
Now that we are creating databases, I thought I would share an article I found in which a DBA describes a day at work and offers some tips on supporting daily tasks. The article, named “The Case of the Missing Index”, starts with an almost humorous walk through his day as co-workers ask him to fix an issue with the DB application. He runs through his troubleshooting routine and eventually finds the culprit which, as you may have guessed, was a missing index file that a co-worker had deleted to free up room on the hard drive.
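For anyone curious why a missing index matters so much, here is a small sketch using Python's built-in `sqlite3` module. It is not the system or the troubleshooting routine from the article: the `tickets` table, the `idx_tickets_user` index name, and the data are all hypothetical. It only shows how the query plan for the same lookup changes once an index exists.

```python
import sqlite3

# Hypothetical table and data, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tickets (id INTEGER PRIMARY KEY, user_id INTEGER, note TEXT)"
)
conn.executemany(
    "INSERT INTO tickets (user_id, note) VALUES (?, ?)",
    [(i % 100, "note " + str(i)) for i in range(1000)],
)

def query_plan(sql):
    """Return SQLite's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)

lookup = "SELECT note FROM tickets WHERE user_id = 42"

# Without an index on user_id, SQLite has to scan the whole table.
plan_without_index = query_plan(lookup)

# With the index in place, SQLite can search the index instead.
conn.execute("CREATE INDEX idx_tickets_user ON tickets (user_id)")
plan_with_index = query_plan(lookup)

print(plan_without_index)
print(plan_with_index)
```

The exact wording of the plan text varies between SQLite versions, but the first plan reports a table scan and the second names the index.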

Database Security: Oracle or SQL?

by Asim K
In her 2010 article titled “SQL Server Most Secure Database; Oracle Least Secure Database Since 2002”, Laura DiDio explores the security and vulnerability of two leading database systems: Oracle and Microsoft SQL Server. Right from the title, DiDio makes the case that SQL Server is more secure than Oracle, and not just by a tiny margin. Over an eight-and-a-half-year period, from 2002 to 2010, the NIST National Vulnerability Database (CVE) statistics recorded 321 security-related issues for Oracle, the highest of any vendor. That was six times the number reported for SQL Server. DiDio explains that SQL Server’s security record is not a fluke or the luck of the draw; rather, it is a direct result of Microsoft’s Trustworthy Computing Initiative, launched in 2002, under which Microsoft stopped code development across all product lines to scrub the code base and make its products more reliable and secure.

Common Security Mistakes in Web Apps

by Asim K
When creating a website, especially one that is either controversial or receives a lot of hits, one is entrusted with keeping the website secure. Beyond just getting the website up, one has to worry about a multitude of issues, including general security on the web. Is user data safe? Can an attacker pollute your website with fake data? These are just some of the questions Philip Tellis raises in Smashing Magazine. The first security mistake mentioned is Cross-Site Scripting, in which an attacker’s code is injected into pages served by the victim website and runs in visitors’ browsers as if it came from that site. The second is Cross-Site Request Forgery, where a malicious website tricks visitors into performing an action on a victim site. The third is Clickjacking, where invisible elements are layered onto a page to trick users into clicking buttons or submitting information they never intended to. The fourth is the most relevant to our class: SQL Injection. SQL Injection is a method in which an attacker exploits input fields on the website to gain access to the database server and make changes to it. It can even give the attacker the power to run any SQL statement they want on your server, including DROP TABLE, which deletes a table along with all the data in it. Similar to this is Shell Injection, in which unsanitized input is passed to a system shell so that the attacker can run commands on the server. The last and most popular is Phishing, which relies on creating scam websites.
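Since SQL Injection is the item most relevant to our class, here is a minimal sketch of how it works and how parameterized queries prevent it. This is my own example, not code from the Tellis article: the `users` table, the two function names, and the malicious input are made up, and Python's built-in `sqlite3` module stands in for a real database server.

```python
import sqlite3

# Hypothetical table and data, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, secret TEXT);
    INSERT INTO users (name, secret) VALUES ('alice', 'a-secret'), ('bob', 'b-secret');
""")

def find_user_unsafe(name):
    # Vulnerable: user input is pasted directly into the SQL text,
    # so quotes in the input can change the meaning of the query.
    query = "SELECT name, secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()

def find_user_safe(name):
    # Safer: the ? placeholder sends the input as data, never as SQL.
    return conn.execute(
        "SELECT name, secret FROM users WHERE name = ?", (name,)
    ).fetchall()

malicious = "nobody' OR '1'='1"

leaked = find_user_unsafe(malicious)   # the OR clause matches every row
blocked = find_user_safe(malicious)    # no user has this literal name

print(leaked)
print(blocked)
```

With the unsafe version the attacker's input turns the WHERE clause into a condition that is always true, so every user's row comes back. With the placeholder version the same input is just an odd-looking name that matches nothing.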

Implementing SQL 2012 for Big Data

by Sam T
In this article, the author discusses Microsoft’s release of SQL Server 2012. With this release, SQL Server can help organizations analyze large amounts of data, otherwise known as big data. Microsoft is promoting SQL Server 2012 as a useful tool for examining big data and as a link between unstructured data platforms and data warehouse based tools. One early user has to process about 350 GB of social networking data a day; with SQL Server 2012, it queries billions of rows of data in seconds, which lets the user iterate and test a large number of scenarios in a short amount of time.

Designers Must Do the Modeling

by Asim K

In his article “Designers Must Do the Modeling”, published in IEEE Software, Volume 15, Issue 2, Brian Lawrence argues that the designers of a database, or of any other project, must do the modeling. Lawrence defines the first step as figuring out the customer’s problem rather than figuring out requirements for the project. In his logical breakdown, Lawrence treats ERD models as only an output of the requirements process, one that will dictate database design later on. He follows this with another opinion: producing the ERD (or whichever type of diagram you are working with) helps us understand the customer’s problem better so we can design better solutions. Because the designers have to produce the requirements model, Lawrence takes the not-so-popular position that the designers themselves are the owners of the model. Citing a quote by Dwight Eisenhower, the author values the planning process over the actual plan. To reinforce this, Lawrence says that managers must help persuade designers that they must model requirements, whether or not the designers see it as their duty. In the spirit of “learning by doing”, Lawrence holds that we learn the most during the planning phase, not during the implementation of the plan. Personally, I agree with this view because I have experienced the same thing myself. When I was younger, around 11 or 12 years old, I would sit down to learn HTML as a hobby (yes, HTML was my hobby). Although my websites turned out looking like absolute trash and functioned on a pretty depressing level, in the process of researching, working with clients, and figuring out bells and whistles, I developed a more holistic understanding of what I was learning, and I still retain that knowledge today.
On these terms, I agree that the designers, the individuals who actually work with the client to figure out their problems and solve them, are the people who should create the requirements for their projects; in our case, a database.

Lawrence, B. (1998). Designers must do the modeling. IEEE Software, 15(2). Retrieved from http://0-ieeexplore.ieee.org.opac.library.csupomona.edu/stamp/stamp.jsp?tp=&arnumber=663782.

Database Performance Innovations

by Kevin S
This week I read the journal article “Revolutionize Database Performance”. The article focuses on seven innovations that the author feels will revolutionize data warehousing. They matter because RDBMS platforms, which worked well in the past, have struggled since they were not designed to query the large amounts of data that businesses have today. The seven innovations are: column store architecture, aggressive compression, multiple sort orders, automatic database design, recovery by query, concurrent data loading and querying, and standards-based appliances. According to the author, adopting new approaches to RDBMS design will allow detailed searches to be processed 50 or 100 times faster and at a fraction of the cost.
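To get a feel for the first few innovations on that list, here is a toy sketch in plain Python of the difference between a row layout and a column layout. It is my own simplification, not the design described in the article: the sample orders are made up, and run-length encoding is just one compression scheme I am assuming for illustration.

```python
# Hypothetical order data, invented for this illustration.
rows = [
    {"order_id": 1, "region": "west", "amount": 120.0},
    {"order_id": 2, "region": "east", "amount": 75.5},
    {"order_id": 3, "region": "west", "amount": 30.0},
    {"order_id": 4, "region": "west", "amount": 10.0},
]

# Column store: one list per column instead of one record per row.
columns = {key: [row[key] for row in rows] for key in rows[0]}

def total_row_store(rows):
    # A row layout visits every full record even though only one field is needed.
    return sum(row["amount"] for row in rows)

def total_column_store(columns):
    # A column layout reads only the single column the query asks about.
    return sum(columns["amount"])

def run_length_encode(values):
    """Collapse runs of repeated values into [value, count] pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded

# Sorting a column first creates long runs, which compress well.
compressed_region = run_length_encode(sorted(columns["region"]))

print(total_row_store(rows), total_column_store(columns))
print(compressed_region)
```

Both totals are the same; the difference is how much data has to be read to compute them. The sorted `region` column also shows why sort order and compression go together: repeated values sit next to each other and collapse into a few pairs.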

Big Data and Language: How do they relate?

by Asim K
In an interview with WIRED.com, Martin Wattenberg, a mathematician and computer scientist at IBM’s Watson Research Center in Cambridge, expresses the importance of “Big Data”. Wattenberg specializes in large textual data sets, meaning that he focuses on terabytes of language. Wired.com picks Wattenberg’s brain, asking why he is so captivated by reading such data sets, and Wattenberg points to the importance of language. Language is one of humanity’s core mediums, through which we read, explore, and encode our identity as human beings, much like the blog post I’m typing right now. For example, one may have a database holding petabytes of words, literature, and books, and yet even twelve words from Voltaire, an example given by Wattenberg, can hold a lifetime of experience. To help analyze petabytes upon petabytes of information, Wattenberg has created a visual representation of everyone’s favorite quick-information website: Wikipedia. The visualization assigns a color to each word in the dictionary and then maps how often each word is used, which helps Wattenberg analyze the data easily.

Moving from Logical to Physical

by Kevin S
As we prepare to take the next step in database design, it is important to relate the new material to what we have already learned. In the article “Logical Versus Physical Database Modeling”, authors Ryan Stephens and Ronald Plew do just that. They describe data modeling as “a link between business needs and system requirements”. They summarize the logical model deliverables as including the ERD, business process diagrams, and user feedback documentation, whereas the deliverables for physical modeling include server model diagrams and the related feedback documentation.