DrupalCon Boston 2008 Day 2
During the sessions and events at DrupalCon Boston 2008, some smaller meetings were taking place in the hallway and in the Birds of a Feather room. I got to the conference late again because we had some additional trouble navigating the Boston roads this morning, so when I arrived, I walked in partway through a meeting with Dries Buytaert, Angie Byron, Dmitri Gaskin, Earl Miles, Lynette Miles, Sprout Miles, Károly Négyesi, Moshe Weitzman, Steven Peck, and some others (I'm getting tired of listing people). The group was big enough that it was hard to hear people if they spoke too softly.
The topic of the morning was the unit testing Dries announced he would like to have in place for Drupal 7 during his State of Drupal keynote. The meeting didn't appear to end with anything concrete, but there did seem to be a general consensus that Simpletest could stay in use and that an additional level of testing would be needed for checking the interactions between modules.
The discussion also occasionally strayed into the project module where they talked about bug handling and git repository links. Someone raised the idea of automatic cross-posting of duplicate issues. Right now the standard procedure is to mark an issue duplicate and paste a link to the older/correct issue, but pasting the link of the duplicate is often not done in the main issue, so tracking down the duplicate is sometimes impossible. Angie said stuff like that wasn't helping when she tried to go back and find that one issue in a sea of thousands.
Updating and upgrading live sites was the session I went to in the early afternoon. It started small, but outgrew the table. People were standing four deep around the table at times, and I imagine hearing was a problem any farther back than that. The big issue was how to get newly configured data from development, testing, and staging to a production site. Some discussion tried to steer toward ideas for getting fresh copies of the live site to developers, but that really wasn't the biggest concern of the group.
Dave Cohen proposed to keep the first 1000 IDs of nodes and taxonomies reserved for developers to use. All new content on a live site would essentially start from 1000. That way, when you create a new node that you want to make your front page, you can make it on your development server, save the configuration to start at that node number that's less than 1000, and duplicate those in the reserved id space on production.
Another participant said she had already thought of the idea, but she's already up to 5000 of the 10000 node IDs she reserved. She also apparently has an elaborate set of scripts for downloading a snapshot of the live site, diffing it against her development copy, and merging the information. She said that to accomplish it, she has had to revoke editing permissions from users before she takes the database snapshot for diffing and turn off Apache when she dumps the new database back to the live site with the merged data. It sounded to me like she was the only person in that role of responsibility, so it probably wouldn't scale very well to larger groups. She also had some methodology she used to decide which tables got diffed and how it would be done.
A gentleman from France had a hard time conversing, but got across the point that his group uses an even/odd system of id reservations. It scales (theoretically) infinitely large when compared to the block-style reservation, but I'm not sure how much hackery you really have to do to make such a thing possible.
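To make the two reservation ideas concrete, here is a minimal sketch in Python. Nothing here is Drupal code; the function names and the RESERVED_BLOCK constant are made up for illustration, and I'm assuming the live site starts at 1000 as Dave described. A real site would also have to convince the database sequence to hand out IDs this way, which is the hackery I'm unsure about.

```python
# Hypothetical illustration only: these helpers are not part of Drupal.
RESERVED_BLOCK = 1000  # assumption: IDs below this belong to developers


def next_live_id_block(current_max_id, reserved=RESERVED_BLOCK):
    """Block scheme: content created on the live site never lands below `reserved`."""
    return max(current_max_id + 1, reserved)


def next_dev_id_block(current_max_dev_id, reserved=RESERVED_BLOCK):
    """Block scheme: developers allocate inside the reserved range until it runs out."""
    candidate = current_max_dev_id + 1
    if candidate >= reserved:
        raise RuntimeError("reserved ID block exhausted")
    return candidate


def next_id_parity(current_max_id, want_odd):
    """Even/odd scheme: return the smallest ID above current_max_id with the wanted parity."""
    candidate = current_max_id + 1
    if (candidate % 2 == 1) != want_odd:
        candidate += 1
    return candidate
```

The block scheme has a hard ceiling (the RuntimeError above is the situation of being at 5000 of 10000), while the parity scheme never runs out but permanently halves the ID space for each side.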
Ori Pekelman was one person who touched on using SVN for various things. He runs a cron job that updates the development server from SVN about every 20 minutes. He proposed to Dave Cohen that they do a code sprint to write something that would automate synchronizing development and live. Dave and I had to agree that we came to some solutions for specific issues people had in the group, but not The Solution that would solve all the problems of incremented database identifiers. The macro engine of the devel module was identified as one tool, but not for most situations where new node IDs and site configuration collided. I suggested perhaps a Drupal diff tool for node table changes might be necessary to tell an import what exactly should be inserted, overwritten, and/or deleted.
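The kind of diff I had in mind could look roughly like the sketch below. It is Python rather than anything Drupal-specific, the diff_rows name is made up, and it assumes each table snapshot has already been loaded into a dict keyed by ID with the development copy treated as the desired state.

```python
# Hypothetical sketch: compare two snapshots of a table keyed by row ID.
def diff_rows(dev, live):
    """Return what an import would need to insert, overwrite, and delete on live."""
    inserted = {rid: dev[rid] for rid in dev.keys() - live.keys()}
    overwritten = {
        rid: dev[rid]
        for rid in dev.keys() & live.keys()
        if dev[rid] != live[rid]
    }
    deleted = sorted(live.keys() - dev.keys())
    return {"insert": inserted, "overwrite": overwritten, "delete": deleted}
```

Note that the "delete" list would include anything users created on live after the snapshot was taken, so a diff like this only becomes safe when combined with some kind of ID reservation or an editing freeze.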
I also discovered Dave Cohen had modified one of his workflow module implementations to disallow editing a node in specific workflow states. I mentioned that at Classic Graphics we're doing node privacy and field privacy by role and workflow state. His reaction to more field control was hard to interpret. He said he basically hook_form_alter()ed away the $form content since his module was a custom implementation of a content type. It still showed the edit tab, but instead of a form, it showed a drupal_set_message() with instructions on what the user is supposed to be doing instead of form editing. While his solution doesn't require the core hack to node.module I have been using at work, his doesn't seem to allow for proper privacy checks when other modules use the access API. He did mention he thought hook_node_grants() was worth exploring. For example, by adding a core patch to node.module, what we're doing allows the standard access check to return false on something like whether or not to display an edit link in a view. Bryan Stalcup calls it our explicit deny.
I ended the afternoon with another talk led by Larry Garfield with Dries Buytaert, Earl Miles, Dmitri Gaskin, Barry Jaspan, Brandon Bergren and others. Much of the conversation was between Larry, Earl, and Dries and a lot of it went over the heads of people like myself who haven't looked at either the new database patch for Drupal 7 or the new Views.
Larry's session covered some of the new PDO features he and Károly wrote and what requirements the new core Views would have at the database layer. Apparently Andy Kirkham did some benchmarking of the proposed DB code and it was at minimum on par in speed with the current database code. It also has some automagic for query building and backwards compatibility with Drupal 6-style database code. The engine also allows for querying from multiple databases of different types. That means the node tables could remain in MySQL while the cache tables live in SQLite. The Postgres engine isn't written yet, but apparently separate engines are still needed for special handling of database-specific quirks.
Dries thought it was a good idea to make sure the new DB code had functionality to stream blobs from a database. The examples he gave were streaming videos and immediately printing the page cache instead of buffering it through Drupal/PHP. Someone also asked whether that would then allow all files stored on the filesystem in /files to be in the database so access to them could be controlled at a user level, which naturally could also include any of the other access control modules out there like ACL and node_privacy_byrole. The idea of storing all of /files in a SQLite database received grumbles around the table, which was noted by Dmitri.
Dries grabbed Barry from somewhere to talk about Earl's Schema requirements in Drupal 7. Earl was asking for a structure within the Schema API to allow for schema definitions that wouldn't actually create tables in the database so he would be able to create a kind of virtual table for querying fields out of tables as a table of their own in a view, giving profiles as an example where each profile field would be treated as a table. Earl also wanted to be able to read default values from the schema definitions of all table types. In particular blobs are an issue in MySQL because blobs cannot have a default value and defining one as part of the schema for the definition of how to create the table wouldn't make sense. Barry admitted he was perhaps too strict on the rules, but he would write special cases during table creation to ignore default values on fields where no default should be set. That way Views would have a setting to pre-fill the Views forms with. Postgres apparently fails queries that have a default value set for blob fields rather than ignoring it, which could be a significant problem for a module that tries to create a table with such a definition during its installation.
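As I understood Barry's compromise, the schema definition keeps the default so Views can read it, and table creation simply skips it for blobs. Here is a rough Python sketch of that idea. The dict layout is only loosely modeled on a Schema API definition, and every function name plus the NO_DEFAULT_TYPES set are my own inventions, not anything Barry or Earl showed.

```python
# Hypothetical sketch: not the Schema API, just the shape of the special case.
NO_DEFAULT_TYPES = {"blob"}  # assumption: column types whose DEFAULT is skipped at creation


def sql_literal(value):
    """Very small literal formatter for the sketch; real code would use the driver's quoting."""
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def column_sql(name, spec):
    """Build one column definition, dropping DEFAULT for types that cannot carry one."""
    parts = [name, spec["type"].upper()]
    if "default" in spec and spec["type"] not in NO_DEFAULT_TYPES:
        parts.append("DEFAULT " + sql_literal(spec["default"]))
    return " ".join(parts)


def create_table_sql(table_name, table):
    """Build a CREATE TABLE statement from a dict of field specs."""
    columns = ", ".join(column_sql(n, s) for n, s in table["fields"].items())
    return "CREATE TABLE " + table_name + " (" + columns + ")"


def schema_defaults(table):
    """Defaults as declared in the schema, including the ones skipped at creation."""
    return {n: s["default"] for n, s in table["fields"].items() if "default" in s}
```

The point of keeping two code paths is that something like Views could call schema_defaults() to pre-fill a form while the installer never sends a blob default to the database at all.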
Barry also strayed off into discussing fields in CCK and the Schema API, specifically CCK field splitting. I've just recently been dipping my toe into the CCK source, starting a patch to nest field groups. Earl at first agreed with me that fields have a good reason for staying all in the same table, but when he realized he used shared fields for images, he backed away from fighting a CCK system where all fields are split away from the main content table all the time. It's also impossible to argue against Barry's note that even on the content types where fields aren't shared, multivalue fields still need their own table. The part I don't know enough about is the claim that CCK makes a new SELECT query for each field on an un-shared content type anyway. I think that's more of a design problem that could be resolved with some well thought out code cleanup as opposed to feature destruction, but that opinion gained no traction in the discussion. The discussion seems to have already reached its decisions in that arena, at least for revision 1, whatever that happens to mean. Earl thought there was a limit to the number of JOINs MySQL can do, but nobody seemed to think that was a big issue. I think unless a technical limitation is found other than the obvious potential overhead that huge JOINs bring, there's enough momentum to proceed with having fields only stored in separate tables. I didn't think it was the best time to mention the cck_table_despliter module I wrote one night after having exploded CCK tables that had hard queries depending on their unified structure.
On a somewhat unrelated note, I removed my laptop from my bag a couple of times at the conference. Most of the time people keep their attention to themselves, but during Larry's BoF, the guy sitting next to me kept trying to read my screen. He couldn't because I had my brightness turned down and I use a 3M privacy filter. I've sometimes been annoyed that it reduces the brightness on my screen by a noticeable amount, but it has acted as a nice screen protector when I've accidentally spit on my screen while talking or splashed a drink nearby. The last two days it's been nice to have the privacy. I will buy one for every one of my next laptops. I have to admit, I couldn't keep my eyes from wandering to other laptops during sessions and BoFs either. For example, I know Earl was reading an article on CNN.com about Hillary and Obama during Larry's BoF, and during Dries's keynote, Narayan Newton was talking in IRC a lot about what was going on with the messed up wireless Internet access at the conference center and the slowness of drupal.org because of it.
I'm sure a participant of one of the events will get around to reading this. I wrote it all from memory, so feel free to correct me.


thank you for the level of detail!
very helpful, almost feels like i'm still there.