Thursday, May 29, 2008

Groovy SQL - That was easy


I got the chance today to get my hands dirty with GSQL (Groovy's Built in support for SQL), and I must say that it was rather easy. What I needed to do was connect to an embedded Derby database, parse a file, and populate the table with the results. Not to hard right. So, here is what my data looked like in the file I needed to parse:


2008/04/06,pubsubproxy,6166,4061
2008/04/07,pubsubproxy,6166,4061
2008/04/08,pubsubproxy,6170,4061
2008/04/09,pubsubproxy,6170,4061
2008/04/10,pubsubproxy,6170,4061
2008/04/11,pubsubproxy,6170,4061
2008/04/12,pubsubproxy,6170,4061
2008/04/13,pubsubproxy,6170,4061
2008/04/14,pubsubproxy,6170,4061

The best part about this was as I read each line in the file I added the data to the dataset for the table I was populating. The following shows a quick example of how to do this:

static parseData(path) {
println("parsing data for the following path: $path")
setDBSystemDir()
def loc = new File(path)

//Get an instance to the derby database
def sql = Sql.newInstance("jdbc:derby:dashboard;create=true", "org.apache.derby.jdbc.EmbeddedDriver")

//For Each line in the file, split the line based on the comma
loc.splitEachLine(',') {fields ->
def pid = fields[1]
def locResult = Double.parseDouble(fields[2])
def tloc = Double.parseDouble(fields[3])
def timeStamp = getDate(fields[0])

//Get the linesofcode dataset from the database
def ds = sql.dataSet("linesofcode")
//Add the properties from the current line in the file to the dataset.
ds.add(
PID:pid,
TIMESTAMP:timeStamp,
LOC:locResult,
TLOC:tloc,
)
}
println "Done Parsing Data"
}
So as you can see the first thing I do is get a new instance to my database:

def sql = Sql.newInstance("jdbc:derby:dashboard;create=true", "org.apache.derby.jdbc.EmbeddedDriver")

Next is the best part. I loop over each line in the file, splitting the line based of the "," and then dive right in poplulating my dataset with ds.add().

That was easy right? I thought so.

2 comments:

jlorenzen said...

Nice job.
It would be nice to see the table structure (ddl).
mytable {
PID String
TIMESTAMP Date
LOC String
TLOC String
}

Also are you certain you don't need to close the file or any objects related to the db such as sql or ds? Also, reopening heavy resources (ie ds) are costly so watch out for creating new dataSets in a loop.
Obviously for high performant applications. If this is just for populating a db manually than don't worry, but just some things to consider.

Chad Gallemore said...

James, thanks for the comments. Those are definately some things I will take a look at and see if I can improve this. My DB skills are lacking, so I'm sure there are some lessons learned I can take from this.