Hadoop is a platform for performing distributed computing. That's easy enough to understand, right? There are some add-ons for things such as distributed file storage and distributed database access, but at the heart of it, Hadoop is a processing platform that partitions the work across multiple machines in a cluster.
Read this post for setting it up.
Rinaldo Bonazzo's Blog, outdoor passionate - sometimes likes to blog about Disruptive Technology #EnterprisePortal #BigData #Analytics #Cloud #Iot #node.js ...
Showing posts with label DBMS. Show all posts
Showing posts with label DBMS. Show all posts
Friday, April 15, 2011
Tuesday, March 8, 2011
To be relational, or not to be relational? That's NOT the question!
A Sergio Bossa and Alex Snaps interesting presentation see it on slideshare
Saturday, April 10, 2010
Intercept null value on a query
These last days working to create a Pentaho report with openERP data, I came across a serious problem:
if a query return a null value arithmetic operations are in error. The question was how to intercept the null value, surfing on Google I found this
Again surfing on Google and I found COALESCE
The postgresql manual say:
Finally my problem was solved my query now is something like this
SELECT COALESCE(sum(aal.amount * -1) , 0) FROM account_analytic_line AS aal WHERE (aal.account_id=83) and aal.journal_id = 3
Rinaldo
if a query return a null value arithmetic operations are in error. The question was how to intercept the null value, surfing on Google I found this
SELECT IFNULL (1,0) this query return 1Fine my problem was solved NO!! openERP use postgresql and postgresql doesn't support IFNULL
SELECT IFNULL(null, 0) this query return 0
Again surfing on Google and I found COALESCE
The postgresql manual say:
The COALESCE function returns the first arguments of ITS THAT IS NOT NULL. Returned is null only if all arguments are null. Often it is used to substitute a default value for null values
Finally my problem was solved my query now is something like this
SELECT COALESCE(sum(aal.amount * -1) , 0) FROM account_analytic_line AS aal WHERE (aal.account_id=83) and aal.journal_id = 3
Rinaldo
Sunday, February 28, 2010
Optimizing db performance
A couple of weeks ago my colleague Daniel asked me tips and info on optimizing db.
Afficher en écriture latine
This is only a starting point, in future I will investigate Change Data Capture
Afficher en écriture latine
With the documentation I have, I found a certain number of techniques, some available on opensource DBMS and other only on proprietary DBMS.
- Indexing normaly when you query a table all the table data will be read (full table scan). Indexing data is like indexes in a book. So primary key is a special kind of index, it's the unique identifier for a row on the db. How to build indexes? the answer is not easy, but we must always remember that creating a db is not a project but an ongoing process.
If for example I want to get all the sales ordered by date I will create an index on date_sales field with descending sort order. In this way retrieving last week's sales won't be slow. - Bitmap indexing this is a sepcial form of index (unfortunatly not available on MySql) normaly used in columns with a low number of unique values gender, color, ...). The idea is to create for the distinct number of the values in the column, a bit projection is created. This enable faster access to the indexed values.
- Partitioning introduced in MySql 5.1 it's the capibility to slice a table into multiple physical pieces. For example a large table containing sales data can be partioned by year so that queries for a given year will be performed only on the portion of the data for the requested year.
- ...
techniques, use of products such as Kettle, Mondrian, ...
Good eveningRinaldo
Subscribe to:
Posts (Atom)