Schema API

From StatusNet
Jump to: navigation, search


The old

Current system introduced in 0.9: Plugin schema changes


  • have to duplicate specifications between several ways:
    • ColumnDef arrays for creating/update schema
    • column type bitfields for DB_DataObject
    • sequence key for DB_DataObject
    • keys list for DB_DataObject
    • keys list for Memcache_DataObject
    • for core tables, also .sql files for MySQL and Postgres
  • no way to specify compound indexes other than primary keys



  • explicit syntax for multi-column keys
  • some canonical types to map from
  • explicit syntax for specifying foreign keys
  • gives enough info to generate tables and all the DB_DataObject metadata
  • proven support for mysql, postgres, sqlite, and more.

Need to add:

  • enum syntax
  • possibly fiddle with timestamp vs datetime issues...
    • auto-updating timestamps are used a lot
  • couple of freak things where funny-size / ascii stuff is specified?
  • foreign keys
  • booleans



   static function schemaDef()
       return array(
           'fields' => array(
               'uri' => array('type' => 'varchar', 'length' => 255, 'not null' => true),
               'profile_id' => array('type' => 'integer'),
               'group_id' => array('type' => 'integer'),
               'feeduri' => array('type' => 'varchar', 'length' => 255),
               'salmonuri' => array('type' => 'varchar', 'length' => 255),
               'avatar' => array('type' => 'text'),
               'created' => array('type' => 'datetime', 'not null' => true),
               'modified' => array('type' => 'datetime', 'not null' => true),
           'primary key' => array('uri'),
           'unique keys' => array(
               'ostatus_profile_profile_id_idx' => array('profile_id'),
               'ostatus_profile_group_id_idx' => array('group_id'),
               'ostatus_profile_feeduri_idx' => array('feeduri'),
           'foreign keys' => array(
               // Ok think these examples are correct now:
               'ostatus_profile_profile_id_fkey' => array('profile', array('profile_id' => 'id')),
               'ostatus_profile_group_id_fkey' => array('user_group', array('group_id' => 'id')),

Work in progress

What's implemented:

  • preliminary table defs for core
  • dumpschema.php test script to check live schemas
    • ./scripts/schemacheck.php notice # detect live structure of notice table and dump it
    • ./scripts/schemacheck.php --diff notice # detect live structure of notice table and compare it against our core definition
    • ./scripts/schemacheck.php --all # detect live structure of all core tables and dump them
    • ./scripts/schemacheck.php --all -diff # detect live structure of all core tables and compare them against our core definitions
  • schema.php mysqlschema.php pgsqlschema.php
    • Still partially in conversion from the old system
    • Basic detection of MySQL and PostgreSQL table info
      • not all type conversions are correct yet
      • primary, multi-value, and foreign key info not yet retrieved for PG (done 2010-10-11)
      • field comments are not stored or retrieved on postgresql
      • foreign key info not yet retrieved for MySQL (though these are not currently used live, our table defs do list some)
      • auto-update timestamps, some other bits still need to be worked out
    • Auto-conversion of old-style table defs for plugins
      • needs further testing to make sure this actually works
    • Table creation and update logic is incomplete, but table updating should theoretically work on PG once done (old code was unable to update individual fields on PG)
    • 'Update buddy' script to run the updater code but producing an .sql file instead of live changes; this can be helpful at large-scale situations where some sites may need manual running of some code.

Still to do:

  • finish all type conversions/definitions
  • allow for 'enum' types
  • allow consistent way to specify auto-update timestamps
  • ensure primary keys & autoupdates are consistently set
  • see if we can get prefix lengths on prefix indices (eg first 255 bytes of a blob field)
  • do a supported-feature filter/strip step before schema comparison
  • patch up installer/updater to use the schema functions for core tables

Higher-level work:

  • hash the schema layouts and keep a list of last-known-state for each table
    • then checkschema events can be much cheaper in the common case; instead of examining the live tables we can just pull a list of hashes out of a single table. If any don't match or are missing, we can run the full schema comparisons and udpate those tables.
  • make sure checkschema functions currectly in background threads, when switching sites
    • we want cheap checkschemas to be able to run on-demand in queue threads to aid in maintenance.
  • add hook points to provide for specific data-conversion steps after certain actions?
    • eg create this field: then fill it out with this data?
Personal tools