Get timestamp of last result from MySQL DB
Recently I ran into a problem where I only wanted to parse search results from Twitter that were not already in my database. A quick Google search turned up very little, so I will post some code snippets to illustrate how to accomplish this.
First we must obviously open a db connection to the MySQL server:
try {
    String userName = "yourClient";
    String password = "yourPass";
    String url = "jdbc:mysql://localhost/yourschema";
    // Load the MySQL JDBC driver.
    Class.forName("com.mysql.jdbc.Driver").newInstance();
    // conn and s are assumed to be fields declared elsewhere in the class.
    conn = DriverManager.getConnection(url, userName, password);
    s = conn.createStatement();
} catch (Exception e) {
    System.err.println("DB Connection Broken");
}
Once the connection is established, we want to retrieve the last entry in the db and grab its timestamp.
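// The pattern must match how TweetTime was written to the table.
SimpleDateFormat format =
        new SimpleDateFormat("EEE MMM dd HH:mm sss 'GMT'Z yyyy");
// Both dates start at the same instant; if date still equals currentDate
// after the query, the table returned no rows.
Date date = new Date();
Date currentDate = new Date(date.getTime());
ResultSet rs = null;
try {
    s.execute("SELECT TweetTime FROM `data` "
            + "ORDER BY ID DESC LIMIT 1;");
    rs = s.getResultSet();
} catch (SQLException e) {
    System.err.println(e.getMessage());
    System.err.println(e.getErrorCode());
}
// rs stays null if the query failed, so guard before iterating.
while (rs != null && rs.next()) {
    String entry = rs.getString(1);
    try {
        date = format.parse(entry);
    } catch (ParseException e) {
        e.printStackTrace();
        date = null;
    }
}
// No usable timestamp: the table was empty or the value did not parse.
if (date == null || date.equals(currentDate)) {
    return null;
} else {
    return date;
}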
What the heck is going on here? We first set up two dates, both initially holding the current system time. A SQL call is passed to the server: “SELECT TweetTime FROM `data` ORDER BY ID DESC LIMIT 1”. MySQL does not guarantee any row order unless you sort on something, and an auto-incrementing field gives you insertion order to sort on. In my case ID auto-increments with each new entry, so the query selects the TweetTime field of only the last row based on the ID field. The rest of the routine compares the result to the system time, basically to check that there were in fact entries in the db. If there were none it passes null back to the caller; if it finds one it passes the date back. Now that the date is passed back, the comparison with results from the incoming query is accomplished as follows:
if (lastRecordedTweet == null
        || tweet.getCreatedAt().after(lastRecordedTweet)) {
    tweetInfo = injestTweet(tweet);
}
The above code uses the Twitter4J library to get the tweet info. I will be posting more on this later. What is important to take home here is that it compares two Java Dates to check that the data to be processed is not already in the db.
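As a minimal, self-contained sketch of that comparison (no Twitter4J or database involved): the class `NewTweetFilter` and its methods are hypothetical names, and plain `Date` values stand in for whatever `tweet.getCreatedAt()` would return.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Hypothetical helper, not part of Twitter4J: keeps only timestamps that are
// newer than the last one already stored in the database.
class NewTweetFilter {

    // A null lastRecorded means the table was empty, so everything is new.
    static boolean isNew(Date lastRecorded, Date createdAt) {
        return lastRecorded == null || createdAt.after(lastRecorded);
    }

    // Returns the subset of incoming timestamps that should be ingested.
    static List<Date> keepNew(Date lastRecorded, List<Date> incoming) {
        List<Date> fresh = new ArrayList<Date>();
        for (Date createdAt : incoming) {
            if (isNew(lastRecorded, createdAt)) {
                fresh.add(createdAt);
            }
        }
        return fresh;
    }

    public static void main(String[] args) {
        Date last = new Date(2000L);
        List<Date> incoming = new ArrayList<Date>();
        incoming.add(new Date(1000L)); // older than the stored timestamp
        incoming.add(new Date(2000L)); // same instant as the stored timestamp
        incoming.add(new Date(3000L)); // newer than the stored timestamp
        System.out.println(keepNew(last, incoming).size());
        System.out.println(keepNew(null, incoming).size());
    }
}
```

Because `after` is a strict comparison, a tweet carrying exactly the stored timestamp is skipped. That avoids re-inserting the last stored row, at the cost of missing a different tweet created at the very same instant.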
Simple REPAST colour ramp
The following is an example of a simple colour ramp utilizing alpha channel values restricted to a specific range. The code is used for dynamic generation of the ramp at each step of the model. Make sure to import uchicago.src.sim.gui.ColorMap and java.awt.Color.
public ColorMap setColourRamp(int range) {
    ColorMap ramp = new ColorMap();
    Color c1 = Color.red;
    for (int i = 1; i < range; i++) {
        // Alpha rises linearly with i, from nearly transparent to nearly opaque.
        int alpha = (int) (255 * (float) i / (float) range);
        Color c = new Color(c1.getRed(), c1.getGreen(),
                c1.getBlue(), alpha);
        ramp.mapColor(i, c);
    }
    return ramp;
}
Pow instant dynamic ranges.
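For anyone who wants to try the alpha arithmetic without REPAST on the classpath, here is a rough standalone sketch using only `java.awt.Color`. The class `AlphaRamp` and its methods are hypothetical names, and a plain array stands in for the `ColorMap`.

```java
import java.awt.Color;

// Hypothetical standalone version of the ramp: same alpha formula,
// but stored in an array instead of a REPAST ColorMap.
class AlphaRamp {

    // Alpha for entry i of a ramp with the given number of steps.
    static int alphaFor(int i, int steps) {
        return (int) (255 * (float) i / (float) steps);
    }

    // Builds `steps` colours sharing the base colour's RGB values,
    // with alpha rising linearly from 0 at index 0.
    static Color[] build(Color base, int steps) {
        Color[] ramp = new Color[steps];
        for (int i = 0; i < steps; i++) {
            ramp[i] = new Color(base.getRed(), base.getGreen(),
                    base.getBlue(), alphaFor(i, steps));
        }
        return ramp;
    }

    public static void main(String[] args) {
        Color[] ramp = build(Color.red, 5);
        for (int i = 0; i < ramp.length; i++) {
            System.out.println(i + " -> alpha " + ramp[i].getAlpha());
        }
    }
}
```

Note that the top entry never quite reaches an alpha of 255 with this formula, since i stops one short of the step count.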
Malaria, why should we care?
In a few hours I am giving a brief talk at the University of Waterloo ENV Charity Ball. This year’s charity is the Canadian Red Cross project Malaria Bites. As I am actively working on the Malaria in the Amazon project, I was asked to say a few words about why this cause is important. I like to give my talks unscripted but sat down this afternoon to put a few ideas to paper. During this process I was taken aback by some of the numbers I glance at daily from my own simulations as well as global indicators. According to the WHO, as of 2006 there were over 3 billion individuals at risk of infection from the Plasmodium parasites that cause malaria, and approximately 250 million recorded cases. Sadly, around one million people die every year, most of whom are 5 years or younger. Looking at the mortality of malaria alone, I think that most people are underwhelmed and do not sense a dire need for action. Tonight I plan to stress the social and financial implications of acquiring malaria. Imagine yourself as a provider for your family: you live in a third world nation and contract malaria. Suddenly you are stricken by symptoms that either diminish or totally negate your ability to provide for your family. Sustainability at this point is fractured, and courses of action such as taking your children out of school must be taken to help the family survive. We as western nations talk much about creating an equal playing ground for all nations to develop and promote sustainability; here lies a problem that prevents such goals. So today we look to $7 donations that provide a basic but effective means of defence against malaria infections. The Malaria Bites program will provide a bug net, and education about its use, to African families. Malaria and the vectors that carry it adapt to the new drugs and defenses that we use, and for this project to work we need to attack the problem full on. A half-hearted attempt will actually make the problem worse, breeding tolerances into the populations against our cheapest and most effective defenses.
It was not so long ago (prior to the 70’s) that North America and Western Europe were affected by malaria. Just because rich countries have declared victory does not mean we should be giving up. It is the countries that need relief the most where malaria is the worst: the world’s poorest.
ESA Earth Explorer 7: Can BIOMASS survive this round?
A little space news from yesterday: three projects have moved into feasibility testing at the European Space Agency. The winner of this user-driven competition will be launched in 2016 as the Earth Explorer 7 mission. This news hits home as the CoReH2O (Cold Regions Hydrology High-Resolution Observatory) project has been selected to progress to the next stage. One of the PIs on this project is none other than the University of Waterloo’s Dr. Claude Duguay. It’s great to see UW making a name for itself in the international remote sensing community. CoReH2O will attempt to provide high-resolution data about water stored as snow on the Earth’s surface. It will use 9.6 GHz (X-Band) and 17.2 GHz (Ku-Band) synthetic aperture radar in a sun-synchronous orbit to collect snow water equivalent (SWE) observations. This mission will appeal to researchers considering climate change, energy balance, and water cycling (Snow/Glaciers/Ice/Sea Ice).
The other two projects being considered are BIOMASS, a P-Band polarimetric system for estimation of above-ground biomass, and PREMIER, a project looking to quantify atmospheric processes using infrared and millimeter-wave emitted radiation. My personal research would typically have me throw my support behind a project such as BIOMASS, but I think that current NASA/JAXA projects for biomass estimation could fill the proposed research gap before this satellite ever sees production. NASA’s Soil Moisture Active and Passive (SMAP) project has considerable potential for secondary research tasks including biomass estimation. It will use a relatively long wavelength (15-30 cm) in the L-Band frequency range (1.2 GHz). The Japanese space agency has been using L-Band with PALSAR for some time now, but data availability is quite restricted. Granted, there is no availability (not that I could find) of space-borne P-Band observations other than this project. I can see benefits of a longer wavelength for global estimates of biomass, but why not just aggregate up higher resolution estimations from an L-Band system? I am no expert on the interaction properties of P-Band and thus cannot comment on the specifics, but it can be assumed that due to its 1 m+ wavelength it will have low spatial resolution. Additionally, NASA’s DESDynI, which will provide simultaneous L-Band interferometric synthetic aperture radar (InSAR) and LiDAR, is being developed with medium-scale biomass applications in mind. I can understand the desire for a dedicated bird for daily, monthly, and yearly estimations because it is doubtful that the other mentioned satellite groups can be convinced to run dedicated passes for biomass estimation. I guess the question comes down to temporal scale: do we really need the low revisit times BIOMASS would provide when PALSAR, DESDynI, or SMAP could be used to create aggregate products at higher temporal scales?
This will be debated and decided in the next months and will be an interesting argument to follow.
The original post from ESA about the candidate promotions can be found here.
Batch REPAST Part 1: Serial slowness
Saturday February 28th 2009, 1:37 pm
Filed under: REPAST
I have been doing a lot of batch runs of the current malaria model in an attempt to sweep through a series of parameters. When it works I can leave my workstation (or the cluster) to rock away at a large range of parameters and eventually refined sub-sweeps. As of now we have three options for running these parameter sweeps. The first and the easiest to implement is REPAST’s integrated batch run system. A parameter file is passed to the model, which throws the BaseController into batch mode, initiating a serial process of single-thread runs. Not the quickest way to approach this problem, but certainly the automation is a benefit. This is very much a “set and forget” option (unless you find some gnarly memory leaks as we’ve had previously). Working with REPAST’s batch mode you will quickly become familiar with the BaseController. One can check if the model is running in batch mode by calling:
getController().isBatch();
A true return indicates that the model is in batch mode. Ideally batch mode would kill any graphics you have running, including graphs and visualizations. Unfortunately this does not seem to be the case, so the boolean return should be used to suppress them yourself. For example:
if(!isBatch){
buildDisplay();
displaySurf.display();
}
During my model build, displays are controlled in their own functions, so suppressing their initialization makes sure that batch mode is run properly. This becomes very important when we are working on SHARCNET clusters to avoid losing time in the queue. The added benefit of killing your visualizations is massive gains in processing time. No one ever pegged REPAST as the best option for visualization, that is for sure. The last bit of prep for batch running is to create and pass a parameter file detailing inputs for each run. Here is an example of a parameter file:
runs: 1
ClimateChange {
start: -5
end: 5
incr: 1
}
The first line is a bit deceptive as it does not detail how many times the model will run but rather how many times the parameter sweep below will run. Set “runs” to a larger number if you are attempting to complete Monte Carlo simulations. Next up is the internal variable you are looking to change. In my case this variable is a modifier to the temperature at each time step of the model. Start, end, and incr define the sweep itself and can be specified as needed. In this example the model will run 11 times, incrementing ClimateChange from -5 to +5 in 1 Kelvin increments. You need to provide a get and set function within the main model for access to the parameter by the controller.
public double getClimateChange() {
    return climateChange;
}
public void setClimateChange(double change) {
    climateChange = change;
}
// Names the parameters the controller may read and set.
public String[] getInitParam() {
    String[] initParams = {"ClimateChange"};
    return initParams;
}
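To make the sweep arithmetic from the parameter file concrete, here is a small sketch that expands a start/end/incr triple into the values visited. This is not REPAST code; `SweepValues` and `expand` are hypothetical names, and the inclusive end value is an assumption taken from the 11 runs described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a parameter sweep; not part of REPAST.
class SweepValues {

    // Expands start/end/incr into the list of values, end inclusive.
    static List<Double> expand(double start, double end, double incr) {
        if (incr <= 0) {
            throw new IllegalArgumentException("incr must be positive");
        }
        List<Double> values = new ArrayList<Double>();
        // Use an integer step count so floating-point error does not accumulate.
        int steps = (int) Math.floor((end - start) / incr + 1e-9);
        for (int i = 0; i <= steps; i++) {
            values.add(start + i * incr);
        }
        return values;
    }

    public static void main(String[] args) {
        // Mirrors the ClimateChange block: start -5, end 5, incr 1.
        List<Double> sweep = expand(-5, 5, 1);
        System.out.println(sweep.size() + " runs: " + sweep);
    }
}
```

With “runs” set above 1, the whole list would simply be repeated that many times.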
The model has a lot more input params, but for the sake of this post I will only show the one. Save the param file wherever you see fit and get ready to pass it to the model.
public static void main(String[] args) {
SimInit init = new SimInit();
MiaModel model = new MiaModel();
if (isBatch){
init.loadModel(model, "data/params", true);
}
else {
init.loadModel(model, null, false);
}
}
If the model is in batch mode, this takes the params file and loads it in batch mode. The boolean argument to loadModel tells the controller to run in batch. The model will now run as a sequence of serial runs. Make sure you are initializing random number generators and other context-specific material in buildModel() and not setup(). There will be null pointers flying all over the place if you fail to do this. It’s also worth mentioning that if you do not properly close out a model run you will run into serious issues, so make sure the controller is made aware that the run is complete. I do so as follows:
if (climate.size() <= (int)getTickCount()){
getController().stopSim();
}
In my post step the model checks to see if there is any climate data left. If not, the model calls the controller to stop. The next post will get into details about how batch runs can be processed in parallel across cores and nodes.
Model Time Step to Date Conversion
Saturday February 28th 2009, 3:41 am
Filed under: Java
In models that involve a time series, knowing what the date is at each time step is important. Today I was dumping results to a database and needed a way to convert the time step to a date for labeling purposes. For the viewers of my upcoming presentations, axis labels of 0,1,2,3,n will not convey a message properly. Now I had done this once before in C#, but after a small survey of the web I was unable to find a solid code example.
The Problem: Need to convert the model’s time step (Integer) to a string such as mm/dd/yyyy.
At first thought I was daunted by the task of writing a function aware of how many days exist in specific months or years. Fortunately the solution itself is actually quite simple. It only requires that you know either the exact starting date or ending date of your model. First we will import two libraries:
import java.text.SimpleDateFormat;
import java.util.GregorianCalendar;
SimpleDateFormat allows for formatting from date to text or text to date. This is extremely flexible and can be tailored to your formatting needs (see the javadoc for examples). GregorianCalendar is a subclass of Calendar specific to the standard calendar of most of the world. The class itself allows for conversion between date objects and integer fields for year, month, date, hour, etc. To take advantage of these classes we initialize each of them in the code.
SimpleDateFormat df = new SimpleDateFormat("ddMMMyyyy");
// Calendar months are zero-based, so January is month 0.
GregorianCalendar modelDate =
        new GregorianCalendar(1990, GregorianCalendar.JANUARY, 1);
A date format is set up in the manner I need for my database labels and a new calendar is initialized. The calendar is not initialized to the actual date but rather is forced back to the beginning of my model at 1-1-90.
My model structures currently dump data to the db at the end of each step, so it’s safe to advance the date after the data is recorded. To do this I do the following:
modelDate.add(GregorianCalendar.DATE, 1);
The DATE field is advanced by one; if your time step is different, change this field accordingly. There is no need to keep track of the other fields as the calendar will tick forward the year and month on its own. I then simply write to my db using the date format created previously:
miaDB.setCellValue(df.format(modelDate.getTime()));
As always I am sure there are better ways to do this but this works and I’m not going to mess with it. I might just post up some results tomorrow for discussion.
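For reference, the same idea can be sketched as a standalone helper that goes straight from a step number to a label. `StepToDate` and `labelForStep` are hypothetical names, one model step is assumed to be one day, and the locale is pinned to English so the month abbreviation does not depend on the machine.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;

// Hypothetical helper: converts a daily time step into a date label,
// assuming step 0 is 1 January 1990.
class StepToDate {

    static String labelForStep(int step) {
        // Months are zero-based in Calendar, hence Calendar.JANUARY.
        GregorianCalendar cal = new GregorianCalendar(1990, Calendar.JANUARY, 1);
        // The calendar rolls months, years and leap days over by itself.
        cal.add(Calendar.DATE, step);
        SimpleDateFormat df = new SimpleDateFormat("ddMMMyyyy", Locale.ENGLISH);
        return df.format(cal.getTime());
    }

    public static void main(String[] args) {
        System.out.println(labelForStep(0));   // 01Jan1990
        System.out.println(labelForStep(31));  // 01Feb1990
        System.out.println(labelForStep(365)); // 01Jan1991
    }
}
```

Computing the label from the step number each time costs a little more than ticking one shared calendar forward, but it cannot drift out of sync if a step is skipped or repeated.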
Canadian Association of Geographers
Saturday February 28th 2009, 3:22 am
Filed under: Conferences
Word came down from my advisor today:
“I just realized the deadline for CAG abstracts is tomorrow. If you wish to submit an abstract, please do… If you are unable to attend the conference in Ottawa, I can present on our behalf.”
Skip out on a conference? I think not. Tomorrow is now today and I have an abstract completed. The conference is in May so I suspect the model will be officially completed by that point. Nonetheless, no major specifics are given in the abstract, to protect myself in the event things have not progressed as such.
Agent-Based Simulation of Malaria in the Amazon
Recent malaria reemergence, epidemic transition, and ensuing low transmission endemic in Iquitos, Peru reveals an interesting case used to explore the spatial dynamics of malaria transmission. In this region of the Peruvian Amazon, climatic change, demographic instability, and landscape fragmentation are amongst a unique set of local spatial variables underlying transmission dynamics. Traditional non-spatial and population based epidemiological models lack the ability to resolve the interactions amongst these variables. Agent-based models present a new opportunity for spatially explicit definition and exploration of causal factors and influences of transmission between mosquito vectors and human hosts. A framework for simulation of malaria transmission has been developed using the RePAST agent simulation toolkit. Three interacting sub-models representing human decisions, vector dynamics, and environmental factors interact to simulate transmission of malaria. The resulting emergent behaviours offer insight into the epidemiological events of the past two decades in Iquitos, Peru. Additionally, considerations will be given to potential model validation and agent handling strategies.
I have not hit the submit button yet as I am sure I will want to make some last minute changes. I will be making a poster for the interdisciplinary center on climate change sessions at UW next week. I don’t enjoy making posters but I made a commitment and need to stick to it. I wonder if it’s worth my while to use LaTeX for a poster.
Neighbour Problems: Weighted Agent Travel
Saturday February 28th 2009, 12:54 am
Filed under: Java, REPAST
A fine $3 departmental breakfast has my stomach full and my mind in a working mood this morning. A trip to the dentist in a few minutes is sure to cure me of this happiness.
First on the agenda was to get a weighted walk system working in the ABM. This system would be in charge of the random walks that the mosquito agents take to find humans to bite or water to lay eggs. Previous research has shown that while the mosquito movements are somewhat random, this particular subgenus avoids forested areas. A relatively simple problem, but the computational overhead becomes compounded when each agent needs to make a weighted travel decision 70 times per step. I think we can all imagine what 70,000,000 weighted walks per step might cause. Instead of doing this, a weight map will be assigned to each cell at the start of the model, which can be read back by the agents. Firstly the Moore neighbours must be collected at every cell (below, p and c represent the x and y coords of the current cell as all cells in the grid are iterated through).
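List habitatCells = new LinkedList();
for (int m = 1; m >= -1; m--) {
    for (int n = -1; n <= 1; n++) {
        // Skip only the centre cell (m == 0 and n == 0).
        if (m != 0 || n != 0) {
            try {
                habitatCells.add(habitatSpace.getObjectAt(p + n, c + m));
            } catch (IndexOutOfBoundsException e) {
                // Off-grid neighbour: keep its slot so positions map to directions.
                habitatCells.add(null);
            }
        }
    }
}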
Here we are moving through each neighbour of the current cell. The center cell is excluded. The value of each neighbour is added to a list. For edge pixels, out-of-bounds locations are marked with a null in the list.
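The neighbour walk can also be tried outside REPAST. The following is only a sketch under my own assumptions: `MooreNeighbours` is a hypothetical class, the grid is described by a plain width and height, and an explicit bounds check replaces the caught exception.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical standalone version of the Moore neighbourhood walk.
class MooreNeighbours {

    // Returns the 8 neighbours of (x, y) as {x, y} pairs, scanning dy from
    // +1 down to -1 and dx from -1 up to +1. Off-grid neighbours are stored
    // as null so that each list position always means the same direction.
    static List<int[]> of(int x, int y, int width, int height) {
        List<int[]> cells = new ArrayList<int[]>();
        for (int dy = 1; dy >= -1; dy--) {
            for (int dx = -1; dx <= 1; dx++) {
                if (dx == 0 && dy == 0) {
                    continue; // the centre cell is not its own neighbour
                }
                int nx = x + dx;
                int ny = y + dy;
                boolean inside = nx >= 0 && nx < width && ny >= 0 && ny < height;
                cells.add(inside ? new int[] {nx, ny} : null);
            }
        }
        return cells;
    }

    public static void main(String[] args) {
        // A corner cell of a 3x3 grid has only three on-grid neighbours.
        int onGrid = 0;
        for (int[] cell : of(0, 0, 3, 3)) {
            if (cell != null) {
                onGrid++;
            }
        }
        System.out.println(onGrid);
    }
}
```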
for (int a = 0; a < habitatCells.size(); a++) {
    if (habitatCells.get(a) == null) {
        // Out of bounds: the agent may never move here.
        directionalWeights.add(a, 0);
    } else {
        if (Integer.parseInt(habitatCells.get(a).toString()) == For) {
            directionalWeights.add(a, 10);
        } else if (Integer.parseInt(habitatCells.get(a).toString()) == Urb) {
            directionalWeights.add(a, 70);
        } else {
            directionalWeights.add(a, 20);
        }
    }
}
Still in the same structure looping through each grid cell, a weighted selector is implemented. WeightedSelector is a function that is somewhat reminiscent of CERN’s WeightedRandomSampler. The difference is that it will accept types such as linked lists as the bins. This is important for passing the selected weighted random direction back to the agent for movement. Each element of the neighbour list is assessed and a weight is assigned based on its coverage type. If the coverage type is null (the case for out of bounds) then a weight of 0 is assigned and the agent may not travel there. Whenever the agent wants to move now, all it needs to do is call weightedDirection(x, y) and it is returned a linked list with new x and y coordinates calculated in a weighted random manner.
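I have not shown WeightedSelector itself, so as a stand-in here is a plain cumulative-weight ("roulette wheel") pick. It is neither my WeightedSelector nor CERN’s sampler, just a minimal sketch of the same idea; `WeightedDirection` and `pick` are hypothetical names, and the weights are assumed to be non-negative integers like the 0/10/20/70 values above.

```java
import java.util.Random;

// Hypothetical roulette-wheel selection over directional weights.
class WeightedDirection {

    // Returns an index chosen with probability proportional to its weight.
    // Zero-weight entries are never chosen; returns -1 if every weight is zero.
    static int pick(int[] weights, Random rng) {
        int total = 0;
        for (int w : weights) {
            total += w;
        }
        if (total <= 0) {
            return -1;
        }
        int roll = rng.nextInt(total); // uniform in [0, total)
        for (int i = 0; i < weights.length; i++) {
            roll -= weights[i];
            if (roll < 0) {
                return i;
            }
        }
        return -1; // not reached when all weights are non-negative
    }

    public static void main(String[] args) {
        // Eight Moore directions: off-grid (0), forest (10), urban (70), other (20).
        int[] weights = {0, 10, 70, 20, 20, 0, 10, 20};
        int[] counts = new int[weights.length];
        Random rng = new Random(42L);
        for (int i = 0; i < 10000; i++) {
            counts[pick(weights, rng)]++;
        }
        for (int i = 0; i < counts.length; i++) {
            System.out.println("direction " + i + ": " + counts[i]);
        }
    }
}
```

The returned index maps back to a position in the neighbour list, which is why out-of-bounds slots are kept as nulls with a weight of 0 rather than dropped.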