Planet DCS@UofT

November 20, 2009

Serendipity

Adaptation just as important as Mitigation

Brad points out that much of my discussion for a research agenda in climate change informatics focusses heavily on strategies for emissions reduction (aka Mitigation) and neglects the equally important topic of ensuring communities can survive the climate changes that are inevitable (aka Adaptation). Which is an important point. When I talk about the goal [...]

by steve at November 20, 2009 08:33 PM

jordi's blog

Modeling tools news - Nov. 20 2009

Recent releases of new modeling tools (in a few cases, "new" may refer to "new to me", meaning that I've discovered the tool just now but maybe the tool already existed):

 

read more

by jordi at November 20, 2009 05:26 PM

Serendipity

Will Peak Oil Save us from Climate Change?

Yesterday, I posted that the total budget of fossil fuel emissions we can ever emit is 1 trillion tonnes of Carbon. And that we’ve burnt through about half of that since the dawn of industrialization. Today, I read in the Guardian that existing oil reserves may have been deliberately overestimated by the International Energy Agency. [...]

by steve at November 20, 2009 04:51 PM

Tech Talk

Academic Genealogy

Just ran my advisors through these two services and came up with this:


by Haz (noreply@blogger.com) at November 20, 2009 10:59 AM

jordi's blog

Methods & Tools - a free software development magazine

Methods & Tools is a free software development magazine with practical knowledge, news and resources on software development: Software Testing, Project Management, Programming (Java,.NET, Ruby on Rails, Ajax), UML, Agile (eXtreme Programming, Scrum, Test Driven Development), Configuration Management, Databases, RUP, Software Analysis, Software Design, Software Quality Assurance, Software Maintenance, Software Process Improvement (CMMI), Software Development Tools, User Interface, etc.

 

read more

by jordi at November 20, 2009 02:39 AM

Animated Database Courseware

I've just found an online Animated DataBase Courseware offering an "Interactive Approach for Teaching the Principles of DataBase Concepts", including topics like database design (ER notation, transformation rules for deriving database schemas from ER models, normalization and functional dependency concepts, ...), SQL, transactions and security.

All this is complemented with proposed exercises (solutions are provided), and it is completely free.

 

read more

by jordi at November 20, 2009 01:14 AM

Does Microsoft still believe in model-driven development? - the Oslo move

When Microsoft started Oslo (in short, Microsoft's strategy for model-driven development, consisting of the M modeling language + the Quadrant modeling environment + a model repository), we all became very excited. If Microsoft was keen on investing in MDD, it was a good indication that they were convinced they could make some money out of it.

 

read more

by jordi at November 20, 2009 12:46 AM

November 19, 2009

Serendipity

Engineering the Software for Understanding Climate Change

Our paper, Engineering the Software for Understanding Climate Change, finally appeared today in IEEE Computing in Science and Engineering. The rest of the issue looks interesting too – a special issue on software engineering in computational science. Kudos to Greg and Andy for pulling it together.

by steve at November 19, 2009 09:01 PM

One Trillion Tonnes of Carbon

I posted a few times already about Allen et al’s paper on the Trillionth Tonne, ever since I saw Chris Jones present it at the EGU meeting in April. Basically, the work gets to the heart of the global challenge. If we want to hold temperatures below a 2°C rise, the key factor is not [...]

by steve at November 19, 2009 03:16 PM

November 18, 2009

Ainsley C. Lawson

First evaluation result

I have a (pretty terrible) result: with the way things stand right now, tracSnap is able to predict 16% of the people who comment on tickets related to Hadley model defects.

How I arrived at this result:
  1. Jon gave me a list of tickets (defects) that he was able to associate with a specific revision of the repository.
  2. For each such ticket, I looked at the files that were involved in its associated revision, and used tracSnap to get the "experts" for those files.
  3. I compared this list of experts with the people who actually helped to fix the defect, and checked to see if they are the same people.
I have a hunch that I can improve this statistic by finding a better way to relate changes in the branches to the files in the trunk. Right now, if you make changes in your branch, you are an 'expert' on only the files in your branch, and not the corresponding file in the trunk. So if you aren't the one to merge your changes to the trunk, you never get associated with the trunk files, and thus will not get suggested as an expert.
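
The three evaluation steps above can be sketched roughly as follows. This is only an illustration, not tracSnap's actual code: the function name `prediction_rate`, the ticket dictionaries, and the `experts_for_file` mapping are all hypothetical stand-ins, and it assumes the statistic is the share of people who actually worked on a ticket that also show up among the predicted experts.

```python
def prediction_rate(tickets, experts_for_file):
    """Share of actual ticket participants that were also predicted as experts.

    tickets: list of dicts with hypothetical keys "files" and "participants".
    experts_for_file: hypothetical mapping from file path to predicted experts.
    """
    hits = 0
    total = 0
    for ticket in tickets:
        # Step 2: collect the predicted experts for every file in the revision.
        predicted = set()
        for path in ticket["files"]:
            predicted.update(experts_for_file.get(path, ()))
        # Step 3: compare the prediction with the people who actually took part.
        actual = set(ticket["participants"])
        hits += len(actual & predicted)
        total += len(actual)
    return hits / total if total else 0.0
```

With this definition, a person who only ever committed to a branch never appears in `experts_for_file` for the trunk files, so they always count as a miss, which is the effect described above.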

As of yet, I've only looked at the prediction rate using the "experts", and not the ticket reporter's "suggested contacts". I suspect that both analyses will produce similar results, since the suggested contacts and experts are not independent (contact suggestions rely partially on the expertise calculations).

16%... lame.

by noreply@blogger.com (Ainsley Lawson) at November 18, 2009 07:11 PM

The Third Bit

Special Issue

A special issue of Computing in Science & Engineering that Andy Lumsdaine and I edited, devoted to software engineering in computational science, is now available. We’d like to thank everyone who contributed:

Computing in Science and Engineering

by Greg Wilson at November 18, 2009 06:32 PM

Zuzel.vp at UofT

zuzelvp

Today I am reading a paper that describes in detail an interesting study in which 120 computer science students rated 100 small code snippets (7.7 lines on average) from 1 to 5 according to their readability. The dataset and the tool used for the study are available. Some of the results of this study are [...]

by zuzelvp at November 18, 2009 05:16 PM

jordi's blog

What we actually know about software development by Greg Wilson

Today seems to be "the recommendation day". For those of you who do not (yet) follow Greg Wilson's blog, I'd like to recommend Greg's presentation "Bits of evidence: What we actually know about software development and why we believe it's true", where he discusses the low standards for proof in software engineering (too many claims without any kind of empirical validation) and how we can start addressing this situation.

 

read more

by jordi at November 18, 2009 12:27 AM

November 17, 2009

jordi's blog

Don't miss Johan den Haan's series of posts on basic concepts of MDD

Johan den Haan has a very interesting blog on MDE topics. Recently he has been publishing a series of posts on basic model-driven development principles/misperceptions/challenges that I'd like to recommend to you. Take a look at my selection of posts (especially useful if you are new to the area of MDE):

 

read more

by jordi at November 17, 2009 11:50 PM

First dollar with the Amazon Affiliate program

When I started this portal I joined two affiliate programs:

 

read more

by jordi at November 17, 2009 11:10 PM

The Third Bit

Evolution in Action

It turns out that a human lifetime may in fact be long enough to see a new species emerge. Cool.

by Greg Wilson at November 17, 2009 04:15 PM

Serendipity

Embodied Social Proxies

While at Microsoft last week, Gina Venolia introduced me to George. Well, not literally, as he wasn’t there, but I met his proxy. Gina and co have been experimenting with how to make a remote team member feel part of the team, without the frequent travel, in the Embodied Social Proxies project. The current prototype [...]

by steve at November 17, 2009 03:26 AM

Michalis Famelis

Evaluation of me as a TA


Last year I was a TA for two terms for the same undergrad course (CSC207 Software Design). I just received the student evaluations for my performance as a TA during the winter term, so now I have a clearer view of my performance during the whole year.

There were 9 and 8 evaluation responses during the Fall and Winter terms respectively, and the students gave me a mark from 1 to 7 inclusive, where 1 stands for extremely poor, 4 stands for adequate and 7 for outstanding. Here are my stats:

                    Fall 2008         Winter 2009
                    mean   std dev    mean   std dev
presentation        2.33   1.41       4.38   1.51
English             3.67   1.41       5.12   1.55
clarity             2.44   1.67       5.00   1.07
enthusiasm          2.78   1.79       6.00   1.07
question handling   2.56   2.07       5.12   1.73
grading fairness    3.44   2.07       4.88   1.25
grading speed       3.67   1.94       4.62   1.69
overall             2.67   2.06       5.38   1.41

I think that the stats kind of tell the story of me coming and adapting to Canada. I came last September and for the first term, I really really sucked as a TA. But then, as I slowly became more and more accustomed to the place, the people, the language and the academic environment, things got much better.

I especially note the “enthusiasm” stat. It’s actually surprisingly accurate: the more I got accustomed to this place, the more I started identifying myself with it and regarding it as a place to which I want to contribute. I just feel sorry for the poor students that had to suffer me during my adaptation period…

Posted in blogging, self-reference, teaching

by plagal at November 17, 2009 01:20 AM

November 16, 2009

The Third Bit

When I Said “The Last Twenty Years…”

Last week, in response to Google’s announcement of a new programming language called Go, I said:

I’m underwhelmed: it’s as if the last 20 years of programming language research hadn’t happened.

Turns out I was being generous: read this post from start to finish, and you’ll see what I mean.

So what should a new programming language do to get my attention? First, just as applications should be designed for testability, languages should be too. That means choosing constructs to make the lives of static and dynamic analysis tools better. Building such tools after the fact is like trying to add security to an app after it has been deployed; I think we’d do better to treat the capabilities of today’s leading-edge program analysis tools as hard (but not unbreakable) constraints on what’s allowed to go into a language, and see how far it gets us. I suspect this will push us toward strongly typed and mostly functional languages.

Second, user testing of language features. The folks at CWI did this with ABC (a precursor of Python); Steven Clarke has done excellent work on API usability at Microsoft (see for example this DDJ article from 2004), and there’s lots of other prior art — hell, I did a little myself nine years ago for Python (see these messages for details). I’m not suggesting design by committee [1], but checking to see how comprehensible or surprising feature XYZ is going to be to the average programmer before it’s put into the language just seems like common sense. I suspect this will push us away from pure functional languages: monads are just plain hard, and while purely functional data structures are possible, they’re hardly intuitive.

Third, a new language should explicitly be designed to make the expression of common design patterns as straightforward as possible. Languages (of all kinds, not just programming languages) evolve by formalizing the common usages of the day: idiomatic uses of goto statements become for loops, structs with function pointers become objects, and so on. There’s a tremendous literature on design patterns at several scales; why not treat them as something akin to use cases?

Of course it’s never too late — if someone has the time and energy, they could apply these three criteria to Go (or any other language) right now. Hm… sounds like an interesting thesis topic…

[1] Which gets an unfairly bad rap — both the American Constitution and the King James version of the Bible were produced by committees.

by Greg Wilson at November 16, 2009 06:13 PM

Speaking at CUSEC 2010

As they just announced on their blog, I’ll be speaking at CUSEC 2010 in Montreal in January on evidence-based software engineering (which is a lot more fun than you’d guess from the title).  Hope to see some of you there.

by Greg Wilson at November 16, 2009 05:41 PM

Bend It ‘Til It Breaks

Want to know how strong a piece of steel is? Bend it ’til it breaks. Want to know how usable a programming system is? Make a few deliberate mistakes and see how comprehensible the error messages are. It’s not the only approach, but it’s the one Zef Hemel took with Ruby on Rails. In his original post, he took a critical look at how helpful Rails is when a developer mistypes something. A lot of people misunderstood what he was doing, which prompted a follow-up post; since then, he has tried the same approach with JBoss Seam and Scala Lift. I think this is pretty cool — so cool, in fact, that I’m wondering if there’s a thesis topic in there somewhere…

by Greg Wilson at November 16, 2009 02:16 PM

jordi's blog

Pitfalls of informal modeling languages (cartoon)

John Mylopoulos, in his keynote talk at the ER'09 conference, used the following Far Side cartoon to illustrate the misunderstandings that may occur when using informal modeling languages: there may be a huge difference between what we say and what others understand!

 

read more

by jordi at November 16, 2009 01:41 PM

Serendipity

Three posters at the AGU meeting

Our group had three posters accepted for presentation at the upcoming AGU Fall Meeting. As the scientific program doesn’t seem to be amenable to linking, here are the abstracts in full: Poster Session IN11D. Management and Dissemination of Earth and Space Science Models (Monday Dec 14, 2009, 8am – 12:20pm) Fostering Team Awareness in Earth System Modeling [...]

by steve at November 16, 2009 03:21 AM

November 15, 2009

Catenary

Bad surveys


Yesterday, a grad student asked me whether I could answer a survey he was doing for his research. I’ve struggled to get participants in the past, and seeing that the survey would only take a couple of minutes, I accepted.

It was a survey about the urban design of a particular place at the University of Toronto, which was fine with me. Halfway through the survey, though, I realized he was trying to put some answers in my mouth. He asked me whether I liked that place, and when I said I did he replied: “Really? There’s nothing to like there.” I insisted that I liked the landscape around it; he objected, pointing out that it was just a grass field. I kept insisting, and he grudgingly wrote down my answer.

We went through this process several times. The last straw came when I said the lighting at that place was good and he responded that “it’s pretty dark there right now,” circling the “bad lighting” answer. He only erased that answer after I lost my patience and told him that these were my answers and if he wanted others he should ask someone else.

I know what this survey’s results will look like. Some large percentage of users of this space, it will say, are terribly dissatisfied with it, hence providing support for whatever project this student is designing. What gets me is that results from a survey as poorly and dishonestly executed as this one will carry greater weight than any non-quantitative arguments simply because they produce a percentage number in the end. We’re in love with quantitative evidence, no matter how poorly it is constructed.

As I left the place that evening I looked around with a critical eye. There were definitely some areas that could be improved. Come to think of it, I thought, it was plain to see that lighting was actually pretty bad, and no survey results will convince me otherwise.

 

by Jorge at November 15, 2009 05:58 PM

November 14, 2009

Semantic Werks

My sample Google/Microsoft interview question

As in, one I have to deal with, as opposed to being asked.

Let’s say you want to calculate the powerset P of a list of items. For each member of P you will do a non-zero amount of work, so we would like to make P really small. The size of P is … 2^N. Meaning, of course, that for any useful input P is enormous: heat-death-of-the-universe size.

We have some function f(s) that takes a member of P (a set) and returns True or False. If True, we want to filter out all the subsets of that set. E.g., if f([1,2,3]) = True, we would like to remove [1,2], [1], [2] etc. from P (so we don’t evaluate them as well). So the question is, what is an efficient way to do this? The naïve solution is to generate all the members of P, then iterate over them to remove the members that are subsets of a solution. But of course, for large N, this is infeasible to store and to do. Consider the case where the member we find is the set of all items. In this case we should stop our loop. A good solution will allow us to do this in constant time (and not by using a special case).

Update: I should make it clear that f([1]) = T and f([2]) = T does not mean f([1,2]) = T.

Solutions I’ve pondered include using a bit matrix and bit masking, but I can’t see how to do that in constant time. We might also represent the subsets using a tree whose nodes are related by the subset relation, which would probably give logarithmic or sub-logarithmic time.
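
One way to make the bit-masking idea concrete is the sketch below. It is not a constant-time answer to the question as posed, just an illustration under some assumptions: subsets are encoded as integer bitmasks, they are generated lazily from largest to smallest, and a candidate is skipped (along with everything below it) when it is contained in a set for which f already returned True. The name `maximal_hits` is made up for this example.

```python
def maximal_hits(items, f):
    """Return the subsets for which f is True, never evaluating f on a
    subset of an earlier hit. Subsets are visited largest first."""
    n = len(items)
    hits = []  # bitmasks of subsets where f returned True
    # Each frontier entry is (mask, lowest index that may still be removed).
    # Removing indices only in increasing order reaches every subset once.
    frontier = [((1 << n) - 1, 0)]
    while frontier:
        next_frontier = []
        for mask, start in frontier:
            # mask is a subset of hit h exactly when mask & h == mask.
            if any(mask & h == mask for h in hits):
                continue  # prune this candidate and all of its subsets
            subset = [x for i, x in enumerate(items) if mask >> i & 1]
            if f(subset):
                hits.append(mask)
                continue  # do not descend below a hit
            for i in range(start, n):
                next_frontier.append((mask & ~(1 << i), i + 1))
        frontier = next_frontier
    return [[x for i, x in enumerate(items) if h >> i & 1] for h in hits]
```

If f is True for the set of all items, the first frontier produces no children and the loop ends after a single call to f, without a special case. In general the work is still exponential in the worst case, and each subset check is linear in the number of hits found so far.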

One question I have is whether these questions are always checked for solvability. I mean, it is relatively easy to pose a question that is not solvable in deterministic polynomial time, and just as easy to pose one whose answer cannot even be checked for correctness. That would be a rather unfair question, wouldn’t it? Like, find a linear time algorithm for finding a truth assignment to the following formula. If you can solve that you should probably start your own company.

Related posts:

  1. Notes on implementing EVF in Shrimp
  2. Not the Answer, but the Question
  3. RSS/Atom add-on

by Neil at November 14, 2009 01:38 PM