Saturday, February 28, 2009

Juno: Movie review

(Wikipedia)

Juno is the story of a pregnant teenager. When it came out in 2007 there was much controversy, as everyone with a position on the pro-life/pro-choice/teenage pregnancy/values issues had a go at it. But I never saw it until now.

I was fully expecting it to be one of those movies with a spunky teenage girl and clueless adults around her. But it was not. It is about a teenage girl dealing with the world through her mistakes and with the help of the people around her.

Juno is a high school girl who has gotten pregnant. The other major characters are the boy (surprise!), her best friend, and her parents. While she was initially going to have an abortion (apparently her best friend had helped a few other girls in their school through the process), Juno changes her mind at the clinic and decides to give up her baby for adoption. The couple planning to adopt the baby become the other main characters in the movie.

Instead of the sassy teenager, Juno is introspective and comes to terms with what she recognizes as a mistake. And her focus is on dealing with it. There are others around her, like the school nurse and some of her classmates, who look down on her. But those closest to her react differently. Her parents' first reaction is disbelief, mostly blaming themselves for being bad parents. But they too, after first beating themselves up, switch to dealing with the situation, getting Juno through her pregnancy and the adoption.

What I liked about it was how they handle adversity. Yes, Juno and her parents (and the father) are hard on themselves for messing up. But the focus is on what has to happen next. The parents making sure that lessons are learned. The daughter learning that her parents are paying attention and care about her. And they guide her, with what wisdom they have. And in the end they have her respect.

None of the principals thinks they have it all together. They are just people who have made mistakes and need to figure out life. And in a movie, that is good.

Sunday, February 22, 2009

Beaver Falls Coffee and Tea: Guatemalan Huehuetenango

Beaver Falls Coffee & Tea Blog

Beaver Falls Coffee & Tea is a coffee shop near Geneva College in Beaver Falls, PA. It was started by a couple of alumni who just didn't quite leave; basically the stereotypical college coffee shop. But last year they took the next step: roasting the beans. We occasionally get beans from them, and this time we got some Guatemalan Huehuetenango (which were Direct Trade).

Why is this something of note? It turns out that, like most food, coffee is time sensitive. The green beans are shipped from their origins, but all the steps after that affect the taste, starting from when they are roasted. It matters how long after roasting they are brewed, it also matters how long passes between grinding and brewing, and of course how long the coffee is exposed to direct heat (i.e. either take the coffee off the brewer and pour it into a thermos within half an hour, or dump it).

So a year ago, Beaver Falls Coffee and Tea started roasting. The other roaster in the Pittsburgh area is La Prima, but they have no outlets in the North Hills (I know; I asked their marketing director, and after much discussion among the staff they came up blank). So we made a point of using Beaver Falls Coffee and Tea, even asking them for different styles of roast to try the whole range (light to dark).

We got the Guatemalan this week. It is a medium-light roast. Based on prior experience, we French-pressed it at 7 minutes (for most coffees, we French Press for 5 minutes). At this point, it was still a little light, but we were at the point where the bitterness in the bean was coming out, so I think we have the brew time just right. But the beans were probably too lightly roasted for my tastes.

The dark roasts are probably easier to quality control, and these have gone well (which is why, if you get coffee at places that ship their roasted beans from somewhere else, you should get dark roasts). The medium and lighter roasts have more nuances, but our experience with Beaver Falls Coffee & Tea is that they are inconsistent, and their mediums and lights are often too light (not as much taste).

UPDATE: Batch 2 was this afternoon. This time I ground the beans to a medium grind (rather than coarse like I usually do for French Press) and used a few more beans. Better. I get more of the taste. Again, at 7 min, I can taste just a hint of the bitterness that tells me that I better not let the beans brew any longer. Of course, now I have to use a second filter to filter out the grounds, but this mostly works. But we'll probably make a point to get medium-dark roasts in the future.

Thursday, February 19, 2009

Of Cowards and Conversation at the New York Times

Of Cowards and Conversation, as summarized by Eric Etheridge

I'm in a hotel room, very tired, listening to CNN No Bull with Campbell Brown. And I'm writing. It is very dangerous to write in public while tired, especially if race relations is within spitting distance of the topic.

The setting is two-fold. The New York Post ran an editorial cartoon satirizing the shooting of a chimpanzee who mauled a woman in Connecticut, depicting the chimpanzee as the author of the economic stimulus bill. Al Sharpton and others are railing against it, claiming the chimpanzee actually represents a black man, namely President Obama. The other event was a talk by Attorney General Eric Holder, in which he claimed that “in things racial we have always been and continue to be, in too many ways, essentially a nation of cowards.”

My feeling, in short, is that the fuss over the cartoon is a farce in itself, and that Holder is right, because it is incredibly difficult to have an honest conversation about race relations. I'd say I've only been part of one, at an alumni board meeting at my graduate department. Like real conversations, there were opinions, even disagreements. And like good conversations, there was exploring, depth, and the exchange and understanding of points of view. It was an example of the dialectic. When this particular conversation was over, one of the more vocal people there openly commented that what we had just done was shocking because it was an honest conversation about race relations. Left unsaid was the fact that we probably could not expect to experience another one anytime soon. It is very easy for conversations to become a self-justification session, which tends to ruin the whole thing. (Most conversations in which people are trying to announce their own sanctification are rather boring.)

My wife and I have accepted the fact that if we have kids, they may not follow our paths. And our hope (well, certainly mine) is that in whatever direction our kids go, we will have older people in our circle who can be with them, so they may have a mature example of whatever it is they do. But in my own life, I've been noticing that my circle is less and less broad in scope. It's not married life; I just don't have the energy to explore that I used to. And as big as my world is, the only way I will add to it is if the addition is indeed complementary to what is there.

This, like many choices, is a decision that is made. In this case, openly and honestly. It has the advantage that there are some points that have a very big point of entry. But it is a reality that I don't have as broad a circle as I used to. And the world that I can share in life is smaller than it used to be. And that is a bit sad.

Sunday, February 15, 2009

QOTD: We did not mean to go to the opera on Valentine's Day

(The following is a very loose paraphrase)

K: What did you do yesterday?
L: Well, we went to the opera. You see, when S asked me about Saturday or Sunday, I just picked Saturday night without looking at the calendar.
S: I kept on telling you "Are you sure?"
L: You can't just hint, you have to spell it out!
S: So, we went to the opera on Valentine's Day.
L: We've been so good about never doing anything on Valentine's Day.

K: That is so you

Monday, February 09, 2009

Fresh Air talks about Afghanistan - February 4, 2009

Fresh Air is one of my favorite radio programs. It is a show of interviews. And Terry Gross is a good interviewer, who has the ability to ask the questions at the heart of the matter, and uses the interview to draw out the story the interviewee has to tell.

This episode centers on the current campaign in Afghanistan and is built around two interviews. The first interview is with Sarah Chayes (Taliban Terrorizing Afghanistan), a former NPR reporter who was assigned to Afghanistan in 2001 and has stayed there to run an NGO (Non-Governmental Organization) that is establishing an industry in Afghanistan. The second is with Ahmed Rashid (Taliban Activity Up In Pakistan), a Pakistani journalist. Terry Gross asks them about the conditions they see in Central Asia, what they think about the U.S. troop increase promised by President Barack Obama, and what they think about the future.

Some highlights:

  • Both of them are highly supportive of the idea of adding 30,000 American troops in Afghanistan. The biggest issue they both see is the lawlessness and lack of security, and 30,000 additional American troops will go a long way.
  • Both of them are almost fawning in their admiration for U.S. General Petraeus, the new head of U.S. Central Command, for his leadership as the head of U.S. forces in Iraq and for his understanding of counter-insurgency. Listening to them reminded me of a Sergeant who unashamedly praised General Petraeus as a scholar-warrior, and this was when Petraeus was just starting in his post in Iraq.
  • Chayes sees corruption in the Afghan government as the most difficult problem in Afghanistan. And she sees President Karzai as part of the problem. It is not just the fact that there is corruption. It is the severity (not extent) of the corruption that is wearing out the people. And they remember that it was the corruption among the rulers that made people welcome the Taliban many years ago in the first place.
  • Both of them blame the previous American administration for not pressing the Afghan or Pakistani governments to improve their governance all this time. One reason for the lack of progress was that no progress was required; the Americans would support the Afghan and Pakistani governments regardless.

I find it especially interesting that both welcome the additional U.S. military forces in the region. They both see the U.S. military as the most capable and beneficial force in the area, and one that has learned much over the past few years. But the troops are few in number, and they need other expertise to help with the real issue, improving the Afghans' ability to govern themselves. That is not a job that should reasonably be expected to go to the U.S. military. (Of course, they get that job like all other jobs.)

Friday, February 06, 2009

Book Review: Desktop GIS: Mapping the Planet with Open Source by Gary Sherman

Desktop GIS: Mapping the Planet with Open Source by Gary Sherman

My review

rating: 4 of 5 stars
Desktop GIS covers open source software for use as a Geographic Information System (GIS). In particular, it covers the following programs and libraries:

- Quantum GIS
- uDig
- GRASS (and its Java front end JGrass http://www.jgrass.org)
- PROJ.4
- GDAL/OGR
- PostGIS
- FWTools
- GMT

At the beginning of the book, the author outlines three classes of users: a casual user who only needs to look at data found elsewhere, an intermediate user who visualizes but also creates or converts data, and an advanced user who needs to do spatial analysis. To cover all of these is an ambitious goal, which is further diluted by the author's felt need to cover the entirety of open source mapping in one book.

What the author does well is identify the tools and explain what each can do, with enough to get you started. So Quantum GIS and uDig are the viewers, able to read almost any GIS format (in particular the readily-available ESRI Shapefiles as well as PostGIS). GDAL and OGR can convert anything to anything, including delimited text (these are often distributed as FWTools). PROJ.4 converts from one projection to another (and is embedded in everything). PostGIS is the spatial database that enables spatial analysis. And GRASS is the full-fledged can-do-everything-but-is-hard-to-learn tool. And then there are some random programs that either do something completely different (e.g. OSSIM, which is an imagery analysis tool) or can make a picture of a map with lots of options (e.g. GMT).

What he provides are the basics for the casual or intermediate user. The advanced GIS analyst would get only a taste of GRASS, without learning what can really be done with it. Similarly, while the intermediate user will get a sense of what PostGIS can do, there is not enough space to cover the spatial extensions to SQL that PostGIS supports, so the book loses its value for the advanced user. What could have made this book better was more focus. If the author felt compelled to survey all of open source mapping, the survey could have gone into an appendix with a few paragraphs for each of the miscellaneous tools. But for the book itself, one good focus would have been the GIS stack consisting of data storage, data analysis and data viewing; basically the open source counterpart to ArcGIS/Oracle with Spatial Extensions. And everything that does not play a role in the stack gets pushed into the appendix.

What could have been done? The book tended to be organized by tool. But once past the casual user, almost all tasks required multiple tools. I would have gone:

1. Viewing data (raster, vector, introduction to QGIS, uDig, GRASS)
2. Converting data/data formats (GDAL/OGR, Maybe PROJ.4)
3. Creating/Editing data (digitizing, importing)
4. Spatial databases (PostGIS)
5. Geoprocessing/spatial analysis - GRASS, PostGIS, R-spatstats
6. Tools integration (QGIS/uDig with GRASS/PostGIS)
7. Scripting
8. Customization

and everything that did not fall into this gets a page in an appendix.

Knowing where to start is a big help. Most of the websites either focus on one product, or try to teach everything as being equally important. Gary Sherman at least identifies the main building blocks (QGIS or uDig, GDAL/OGR/PROJ.4, PostGIS, GRASS) and gives enough to get started. And this can be very helpful, because the beginning analyst at least knows where to start.

What would be next? For the person still working within the GIS stack (as opposed to a completely different topic, like imagery analysis which is OSSIM's territory) there are a few obvious topics.

1. PostGIS - Spatial databases with SQL. Maybe even connections with ArcGIS. Even a short (10-page) appendix would have done wonders here.
2. GRASS - The author devotes an additional appendix to this. But to do this right, you probably need to refer to Open Source GIS: A GRASS GIS Approach.
3. R spatial statistics packages. Applied Spatial Data Analysis with R would cover this.

Mostly, a good book to get started in GIS using open source tools. Casual users would be well served. Intermediate users would get started and can find the rest using the internet. Advanced users are going to miss a lot (to the point that they may not even realize these tools are a worthy alternative).



View all my reviews at Goodreads.

Wednesday, February 04, 2009

Loading lat/long data into PostGIS

Base scenario: I have a data table with longitude and latitude data that I want to look at using spatial tools, and I have shapefiles for the metropolitan area that I want to use it with. So, in other words, I have the following:

  1. ESRI shapefiles
  2. Data table with latitude and longitude

The plan is to load the data table into PostGIS, so that I can pull it into a GIS system like uDig, Quantum GIS, or various ESRI products. Because PostGIS can do transforms but ESRI shapefiles are what they are, the data table is transformed to match the shapefiles. The procedure is as follows:

  1. Identify SRID (spatial reference ID) for the ESRI shapefiles. This should include units (feet, meters, or degrees)
  2. Identify a corresponding SRID that uses lat/long degrees
  3. Load data table into PostGIS
  4. Add geometry (points) data to the data table, including transformation from degrees to feet.
  5. Optional: export PostGIS to ESRI Shapefile if needed.

1. Identify SRID (spatial reference ID) for the ESRI shapefiles. This should include units (feet, meters, or degrees)

Shapefiles usually come as a set of files inside a directory. When they come from an official source, one of these should be a *.prj file. This is an example of one:

PROJCS["NAD_1983_StatePlane_Pennsylvania_South_FIPS_3702_Feet",
GEOGCS["GCS_North_American_1983",
DATUM["D_North_American_1983",
SPHEROID["GRS_1980",6378137.0,298.257222101]],
PRIMEM["Greenwich",0.0],
UNIT["Degree",0.0174532925199433]],
PROJECTION["Lambert_Conformal_Conic"],
PARAMETER["False_Easting",1968500.0],
PARAMETER["False_Northing",0.0],
PARAMETER["Central_Meridian",-77.75],
PARAMETER["Standard_Parallel_1",39.93333333333333],
PARAMETER["Standard_Parallel_2",40.96666666666667],
PARAMETER["Latitude_Of_Origin",39.33333333333334],
UNIT["Foot_US",0.3048006096012192]]

The first line identifies the projection, using something that should approach a WKT ("Well Known Text") name of a projection. In PostGIS, when the PostgreSQL database was spatially enabled, a table "spatial_ref_sys" was created that lists every SRID along with its name. Searching this table should reveal a projection with a name very similar to this one. That will define the target SRID for the whole project. The quick way to do this is to write a query:

SELECT srid, auth_name, auth_srid, srtext, proj4text
FROM spatial_ref_sys
WHERE srtext LIKE '%Pennsylvania%'
ORDER BY srtext;

In this case, we find:

2272; "PROJCS["NAD83 / Pennsylvania South (ftUS)", GEOGCS["NAD83", DATUM["North_American_Datum_1983",SPHEROID["GRS 1980", 6378137,298.257222101, AUTHORITY["EPSG","7019"]],

Note that in this case the key parts were NAD83, Pennsylvania South, and ft.
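The search terms can also be pulled out of the .prj file with a few lines of script. This is only a sketch in Python: the function name prj_search_terms is made up for illustration, and it assumes the .prj text starts with PROJCS or GEOGCS and that the last UNIT entry is the one for the outermost coordinate system, as it is in the example above.

```python
import re

def prj_search_terms(prj_text):
    """Return (name, unit) read from the WKT text of an ESRI .prj file."""
    # The first quoted string after PROJCS[ or GEOGCS[ is the system's name.
    name_match = re.match(r'\s*(PROJCS|GEOGCS)\["([^"]+)"', prj_text)
    # Assumption: the last UNIT[...] entry belongs to the outermost system.
    units = re.findall(r'UNIT\["([^"]+)"', prj_text)
    name = name_match.group(2) if name_match else None
    unit = units[-1] if units else None
    return name, unit

# Shortened copy of the .prj example from this post.
sample_prj = (
    'PROJCS["NAD_1983_StatePlane_Pennsylvania_South_FIPS_3702_Feet",'
    'GEOGCS["GCS_North_American_1983",'
    'DATUM["D_North_American_1983",SPHEROID["GRS_1980",6378137.0,298.257222101]],'
    'PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],'
    'PROJECTION["Lambert_Conformal_Conic"],'
    'PARAMETER["False_Easting",1968500.0],'
    'UNIT["Foot_US",0.3048006096012192]]'
)

name, unit = prj_search_terms(sample_prj)
print(name)
print(unit)
```

The name and unit still have to be matched by eye against "spatial_ref_sys", since the ESRI spelling and the EPSG spelling of the same projection differ.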

2. Identify a corresponding SRID that uses lat/long degrees

Searching the "spatial_ref_sys" table for Pennsylvania finds only projections that are associated with feet or meters. So a broader search is needed to find one that uses degrees:


SELECT srid, auth_name, auth_srid, srtext, proj4text
FROM spatial_ref_sys
WHERE srtext LIKE '%NAD83%'
ORDER BY srtext;

This eventually reveals one coordinate system that is in degrees:

4269;"GEOGCS["NAD83",DATUM["North_American_Datum_1983",SPHEROID["GRS 1980",6378137,298.257222101,AUTHORITY["EPSG","7019"]], AUTHORITY["EPSG","6269"]], PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]], UNIT["degree",0.01745329251994328, AUTHORITY["EPSG","9122"]], AUTHORITY["EPSG","4269"]]"

The key here is that the 'UNIT' is "degree" as opposed to "Survey feet" or "Meters".

3. Load data table into PostGIS

By this point, the PostgreSQL database has been created using a template that has PostGIS enabled. I created an SQL file with all the Create Table commands ahead of time. They look like this:

CREATE TABLE table2004
(
case_no character(7) NOT NULL,
week integer,
"month" character(3),
eventdate date,
bg_lat numeric(12,6),
bg_long numeric(12,6),
serial2004 serial NOT NULL,
CONSTRAINT table2004_pk PRIMARY KEY (serial2004)
)
WITH (OIDS=TRUE);
ALTER TABLE table2004 OWNER TO userid;

The key point is that I need an integer field that is a primary key, or at some points indexing becomes a problem. So I made a serial field to serve as the key (my case_no field is a text field, so it was unsuitable). Another issue is that the geographic indexing also likes an OID field, so I set OIDS = TRUE. Most current PostgreSQL documentation mentions that this is deprecated, but PostGIS databases would be an exception.

Next, a geographically (geometry) enabled field is added. PostGIS has a PL/pgsql function named "AddGeometryColumn" that does this. Note that earlier, I established that the target SRID is 2272. Next, because there will be spatial searches, I will create an index on the geometry column.

SELECT AddGeometryColumn('public', 'table2004', 'longlat', 2272, 'POINT', 2);

CREATE INDEX idx_table2004_longlat ON table2004
USING GIST (longlat);

Next, the data is loaded. I had exported the data table to a csv file, then used VIM to surround each line with an INSERT INTO statement so it looked like this:

INSERT INTO table2004 (week, month, eventdate, case_no, bg_lat, bg_long) VALUES (1, 'JUL', '07/01/2004', '6666097', 40.382649, -79.803297);
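The same wrapping can be scripted instead of done by hand in an editor. What follows is a sketch in Python, not what I actually ran: the helper names are invented, and it assumes the csv file has a header row using the same column names as the table. Empty fields become NULL, and text fields are quoted.

```python
import csv
import io

# Column names follow the CREATE TABLE statement in this post.
COLUMNS = ["week", "month", "eventdate", "case_no", "bg_lat", "bg_long"]
NUMERIC = {"week", "bg_lat", "bg_long"}

def sql_value(column, raw):
    """Turn one csv field into SQL literal text."""
    raw = raw.strip()
    if raw == "":
        return "NULL"
    if column in NUMERIC:
        float(raw)  # raises ValueError if the field is not a number
        return raw
    # Quote text, doubling any embedded single quotes.
    return "'" + raw.replace("'", "''") + "'"

def csv_to_inserts(csv_text, table="table2004"):
    """Yield one INSERT statement per csv row (assumes a header row)."""
    column_list = ", ".join(COLUMNS)
    for row in csv.DictReader(io.StringIO(csv_text)):
        values = ", ".join(sql_value(col, row[col]) for col in COLUMNS)
        yield f"INSERT INTO {table} ({column_list}) VALUES ({values});"

sample_csv = (
    "week,month,eventdate,case_no,bg_lat,bg_long\n"
    "1,JUL,07/01/2004,6666097,40.382649,-79.803297\n"
)
statements = list(csv_to_inserts(sample_csv))
print(statements[0])
```

For a large table, PostgreSQL's COPY command would likely be faster than a long file of INSERT statements, but the generated file has the advantage that it can be read and spot-checked before loading.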

This loads the data into PostgreSQL. The next step is to actually create the geographic data.

4. Add geometry (points) data to the data table, including transformation from degrees to feet.

Now that there is a geometry column, and the lat/long information is in the database as a source, each location needs to be added as a point geometry. This is done through an UPDATE statement, using the PostGIS functions "ST_Transform" and "ST_PointFromText" (older PostGIS releases also accepted the unprefixed names "transform" and "PointFromText"). Note that the SRID that corresponded to a lat/long representation was 4269. Also note that POINT is expressed in x, y. So longitude comes before latitude.

UPDATE table2004 SET longlat = ST_Transform(ST_PointFromText('POINT(' || bg_long || ' ' || bg_lat || ')', 4269), 2272);
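Because the x, y ordering is easy to get backwards, a quick check before loading can help. This is a sketch in Python under stated assumptions: the bounding box values are a rough guess at southwestern Pennsylvania chosen for illustration, not something taken from my data, and the function names are made up.

```python
# Hypothetical bounding box, roughly southwestern Pennsylvania.
# These limits are an illustrative assumption, not from the original data.
BOX = {"min_lon": -81.0, "max_lon": -79.0, "min_lat": 39.5, "max_lat": 41.5}

def point_wkt(longitude, latitude):
    """Build WKT for a point. WKT order is x y, so longitude goes first."""
    return f"POINT({longitude} {latitude})"

def in_box(longitude, latitude, box=BOX):
    """True when the coordinate falls inside the expected study area."""
    return (box["min_lon"] <= longitude <= box["max_lon"]
            and box["min_lat"] <= latitude <= box["max_lat"])

wkt = point_wkt(-79.803297, 40.382649)
ok = in_box(-79.803297, 40.382649)
# A swapped pair is still a legal coordinate, just nowhere near
# Pennsylvania, so a plain range check would not catch it.
swapped_ok = in_box(40.382649, -79.803297)

print(wkt, ok, swapped_ok)
```

A simple range check on latitude (-90 to 90) would not flag the swapped pair in this case, which is why the sketch compares against an expected area instead.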

Now, this is done, and the table can be viewed using a GIS viewer that can connect to a spatial database.

Some of the data is messy; in particular, I was not able to get lat/long for all the addresses. So I created a view that removes these from the result (otherwise I get a bunch of points at 0, 0):

CREATE OR REPLACE VIEW table2004work AS
SELECT table2004.week,table2004.month,table2004.eventdate, table2004.case_no,table2004.bg_lat,table2004.bg_long, table2004.serial2004, table2004.longlat
FROM table2004
WHERE table2004.bg_lat > 1 AND table2004.bg_long < (-1);

ALTER TABLE table2004work OWNER TO userid;
COMMENT ON VIEW table2004work IS 'Remove 0,0 data from map';

So this provides a cleaner dataset, and it is also accessible from a GIS viewer.

5. Optional: export PostGIS to ESRI Shapefile if needed.

To export from the PostGIS database to an ESRI shapefile (if it needs to be given to someone who only needs to look at it), the PostgreSQL database should be running. Then pgsql2shp can be run. First create a data directory to hold the shapefile. Then, from the terminal, cd into the target directory and run:

pgsql2shp -f table2004working.shp datatable table2004work

This creates a shapefile with .shp, .dbf and .shx files. Note that you may need to create a .prj file as well, but this can be done from within a GIS viewer.


Note: Much of this was found through the great documentation available at Refractions PostGIS site and the Boston GIS website, which has multiple tutorials on PostGIS and PostgreSQL.

[Edit] In the 'srtext' field, PROJCS identifies the SRID as belonging to a 'projection', which is a curved surface put on a flat plane. When there is no projection, the entry is a pure GEOGCS (geographic coordinate system) and it is in degrees lat/long.
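That distinction can be checked mechanically when scanning rows from "spatial_ref_sys". A minimal sketch in Python, with a made-up function name, assuming the srtext value starts with either PROJCS or GEOGCS as in the two entries shown in this post:

```python
def coordinate_system_kind(srtext):
    """Classify an srtext value by its leading WKT keyword."""
    text = srtext.lstrip()
    if text.startswith("PROJCS["):
        return "projected"   # flat-plane units such as feet or meters
    if text.startswith("GEOGCS["):
        return "geographic"  # degrees of longitude and latitude
    return "unknown"

# Leading fragments of the two entries found above (SRID 2272 and 4269).
kind_2272 = coordinate_system_kind('PROJCS["NAD83 / Pennsylvania South (ftUS)",GEOGCS["NAD83"')
kind_4269 = coordinate_system_kind('GEOGCS["NAD83",DATUM["North_American_Datum_1983"')

print(kind_2272, kind_4269)
```

Only the leading keyword is tested, because every PROJCS entry also contains a nested GEOGCS, so a plain substring search for GEOGCS would match both kinds.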