<?xml version="1.0" encoding="UTF-8"?><rss
version="2.0"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
> <channel><title>Comments on: Database Audit Trails: How Would You Do It?</title> <atom:link href="http://davybrion.com/blog/2009/10/database-audit-trails-how-would-you-do-it/feed/" rel="self" type="application/rss+xml" /><link>http://davybrion.com/blog/2009/10/database-audit-trails-how-would-you-do-it/</link> <description>inquisitive: adjective. given to inquiry, research, or asking questions; eager for knowledge; intellectually curious</description> <lastBuildDate>Wed, 08 Feb 2012 11:42:42 +0000</lastBuildDate> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <generator>http://wordpress.org/?v=3.3.1</generator> <item><title>By: Khansadique123</title><link>http://davybrion.com/blog/2009/10/database-audit-trails-how-would-you-do-it/comment-page-1/#comment-104531</link> <dc:creator>Khansadique123</dc:creator> <pubDate>Tue, 17 Jan 2012 19:47:00 +0000</pubDate> <guid
isPermaLink="false">http://davybrion.com/blog/?p=1764#comment-104531</guid> <description>Audit Trail (Data Activity Monitoring) is a tool that records all user activity on the database, whether users interact with the database directly or through an application. It is very useful for anyone who wants to keep an eye on what users do with the data in the database. It also helps a Q.A. (quality assurance) team that is testing an application and wants to understand the data flow before approving it. The tool is easier to manage, and its reports are easier to view, than other tools. Someone who is not a DBA can use it easily. It is designed for management/administration departments that need to audit data in the database.  It is just like a CC camera on the database.
visit www.bintobin.com for more detail.</description> <content:encoded><![CDATA[<p>Audit Trail (Data Activity Monitoring) is a tool that records all user activity on the database, whether users interact with the database directly or through an application. It is very useful for anyone who wants to keep an eye on what users do with the data in the database. It also helps a Q.A. (quality assurance) team that is testing an application and wants to understand the data flow before approving it. The tool is easier to manage, and its reports are easier to view, than other tools. Someone who is not a DBA can use it easily. It is designed for management/administration departments that need to audit data in the database.  It is just like a CC camera on the database.<br
/> visit <a
href="http://www.bintobin.com" rel="nofollow">http://www.bintobin.com</a> for more detail.</p> ]]></content:encoded> </item> <item><title>By: Implementing Audit Trail for your application &#171; Niraj Bhatt &#8211; Architect&#039;s Blog</title><link>http://davybrion.com/blog/2009/10/database-audit-trails-how-would-you-do-it/comment-page-1/#comment-102313</link> <dc:creator>Implementing Audit Trail for your application &#171; Niraj Bhatt &#8211; Architect&#039;s Blog</dc:creator> <pubDate>Sat, 05 Nov 2011 17:06:50 +0000</pubDate> <guid
isPermaLink="false">http://davybrion.com/blog/?p=1764#comment-102313</guid> <description>[...] recently came across Davy&#8217;s post where he talks about implementing auditing. Davy talks about 3 requirements which I guess are quite [...]</description> <content:encoded><![CDATA[<p>[...] recently came across Davy&#8217;s post where he talks about implementing auditing. Davy talks about 3 requirements which I guess are quite [...]</p> ]]></content:encoded> </item> <item><title>By: Brandon Morales</title><link>http://davybrion.com/blog/2009/10/database-audit-trails-how-would-you-do-it/comment-page-1/#comment-22922</link> <dc:creator>Brandon Morales</dc:creator> <pubDate>Tue, 10 Nov 2009 21:06:44 +0000</pubDate> <guid
isPermaLink="false">http://davybrion.com/blog/?p=1764#comment-22922</guid> <description>I like to keep the history in each table, and combine that with views and triggers. For each table that needs an audit trail I add meta columns {date_created, date_active, date_end, date_replaced}, a primary_key column, and a history_key column. When a new record is added, the date_created and date_active columns receive the current date. The primary key (PK) is an auto-generated GUID, and a trigger copies the PK to the history key. If a record is updated, a trigger replaces the update command: it keeps the old record and only sets its date_replaced and date_end columns to the date of the update. It then adds a new record with the updated data, the previous record&#039;s history key and date_created, and a new date_active. Deletes work similarly, except that we populate only the date_end column, not date_replaced. We then use a view that shows only records with no value in the date_end column, so we see current data rather than history.</description> <content:encoded><![CDATA[<p>I like to keep the history in each table, and combine that with views and triggers. For each table that needs an audit trail I add meta columns {date_created, date_active, date_end, date_replaced}, a primary_key column, and a history_key column. When a new record is added, the date_created and date_active columns receive the current date. The primary key (PK) is an auto-generated GUID, and a trigger copies the PK to the history key. If a record is updated, a trigger replaces the update command: it keeps the old record and only sets its date_replaced and date_end columns to the date of the update. It then adds a new record with the updated data, the previous record&#8217;s history key and date_created, and a new date_active. 
Deletes work similarly, except that we populate only the date_end column, not date_replaced.</p><p>We then use a view that shows only records with no value in the date_end column, so we see current data rather than history.</p> ]]></content:encoded> </item> <item><title>By: Niraj</title><link>http://davybrion.com/blog/2009/10/database-audit-trails-how-would-you-do-it/comment-page-1/#comment-22871</link> <dc:creator>Niraj</dc:creator> <pubDate>Tue, 03 Nov 2009 03:20:12 +0000</pubDate> <guid
isPermaLink="false">http://davybrion.com/blog/?p=1764#comment-22871</guid> <description>Hi Davy, I posted my thoughts &lt;a href=&quot;http://nirajrules.wordpress.com/2009/11/02/implementing-audit-trail-for-your-application&quot; rel=&quot;nofollow&quot;&gt;here&lt;/a&gt;. Let me know your thoughts. Niraj</description> <content:encoded><![CDATA[<p>Hi Davy,</p><p> I posted my thoughts <a
href="http://nirajrules.wordpress.com/2009/11/02/implementing-audit-trail-for-your-application" rel="nofollow">here</a>. Let me know your thoughts.</p><p>Niraj</p> ]]></content:encoded> </item> <item><title>By: Dathan Bennett</title><link>http://davybrion.com/blog/2009/10/database-audit-trails-how-would-you-do-it/comment-page-1/#comment-22862</link> <dc:creator>Dathan Bennett</dc:creator> <pubDate>Fri, 30 Oct 2009 20:25:08 +0000</pubDate> <guid
isPermaLink="false">http://davybrion.com/blog/?p=1764#comment-22862</guid> <description>I agree with Dalibor, with a small addition: we keep an audit schema similar to the first option you described, but instead of serializing changes per field, we serialize per record by encapsulating the changes in an XML document.  Then, if using Microsoft SQL Server, querying the audit log can be done from within TSQL. On the other hand (this is the addition to Dalibor&#039;s suggestion), to more efficiently support queries to find when row X was changed, we add a junction table to create a many-to-many relationship between the audit table and the table that stores column metadata (could also be done against the information schema, system views, etc. I guess, though I&#039;ve never done it).  This way, when working within a database that does not support XML querying, you can still efficiently trim your results to only changes that affected the value of the particular column you&#039;re working with.  If you&#039;re working in a scenario where surrogate primary keys of a uniform data type are used, you can add the primary key as a field to your audit table - at that point it&#039;s easy and fast to get a list of all audit entries that changed the value of a particular cell in your database, regardless of your particular RDBMS.  You&#039;ll still have to iterate through the results to find the particular change you&#039;re looking for, but in my experience this is a very efficient and maintainable solution. Small caveat: creating links between audit rows and column metadata rows can make your auditing model somewhat painful when your schema starts to evolve.</description> <content:encoded><![CDATA[<p>I agree with Dalibor, with a small addition: we keep an audit schema similar to the first option you described, but instead of serializing changes per field, we serialize per record by encapsulating the changes in an XML document.  
Then, if using Microsoft SQL Server, querying the audit log can be done from within TSQL.</p><p>On the other hand (this is the addition to Dalibor&#8217;s suggestion), to more efficiently support queries to find when row X was changed, we add a junction table to create a many-to-many relationship between the audit table and the table that stores column metadata (could also be done against the information schema, system views, etc. I guess, though I&#8217;ve never done it).  This way, when working within a database that does not support XML querying, you can still efficiently trim your results to only changes that affected the value of the particular column you&#8217;re working with.  If you&#8217;re working in a scenario where surrogate primary keys of a uniform data type are used, you can add the primary key as a field to your audit table &#8211; at that point it&#8217;s easy and fast to get a list of all audit entries that changed the value of a particular cell in your database, regardless of your particular RDBMS.  You&#8217;ll still have to iterate through the results to find the particular change you&#8217;re looking for, but in my experience this is a very efficient and maintainable solution.</p><p>Small caveat: creating links between audit rows and column metadata rows can make your auditing model somewhat painful when your schema starts to evolve.</p> ]]></content:encoded> </item> <item><title>By: Mladen</title><link>http://davybrion.com/blog/2009/10/database-audit-trails-how-would-you-do-it/comment-page-1/#comment-22800</link> <dc:creator>Mladen</dc:creator> <pubDate>Fri, 23 Oct 2009 11:44:08 +0000</pubDate> <guid
isPermaLink="false">http://davybrion.com/blog/?p=1764#comment-22800</guid> <description>My favorite way of auditing data is to use async triggers with Service Broker.
You take the deleted and inserted pseudo tables, run FOR XML PATH(&#039;&#039;) on them, and send the result to a Service Broker (SB) queue.
You can have 2 queues: one for saving the raw data from the inserted and deleted tables and one for saving diffed data.
This takes care of DML changes. For DDL changes you can use DDL triggers, which also give you the data in XML format.
With that you can do the same thing as above.
You can have 2 tables: one for DML and one for DDL changes. Using this async SB pattern you can run your auditing on its own server that is meant just for auditing.
All child servers can send their audit data to this central auditing server. I&#039;ve implemented this and it works great. Here&#039;s my article on the subject:
http://www.sqlteam.com/article/centralized-asynchronous-auditing-across-instances-and-servers-with-service-broker</description> <content:encoded><![CDATA[<p>My favorite way of auditing data is to use async triggers with Service Broker.<br
/> You take the deleted and inserted pseudo tables, run FOR XML PATH(&#039;&#039;) on them, and send the result to a Service Broker (SB) queue.<br
/> You can have 2 queues: one for saving the raw data from the inserted and deleted tables and one for saving diffed data.<br
/> This takes care of DML changes.</p><p>For DDL changes you can use DDL triggers, which also give you the data in XML format.<br
/> With that you can do the same thing as above.<br
/> You can have 2 tables: one for DML and one for DDL changes.</p><p>Using this async SB pattern you can run your auditing on its own server that is meant just for auditing.<br
/> All child servers can send their audit data to this central auditing server.</p><p>I&#8217;ve implemented this and it works great. Here&#8217;s my article on the subject:<br
/> <a
href="http://www.sqlteam.com/article/centralized-asynchronous-auditing-across-instances-and-servers-with-service-broker" rel="nofollow">http://www.sqlteam.com/article/centralized-asynchronous-auditing-across-instances-and-servers-with-service-broker</a></p> ]]></content:encoded> </item> <item><title>By: ijrussell</title><link>http://davybrion.com/blog/2009/10/database-audit-trails-how-would-you-do-it/comment-page-1/#comment-22785</link> <dc:creator>ijrussell</dc:creator> <pubDate>Wed, 21 Oct 2009 08:39:40 +0000</pubDate> <guid
isPermaLink="false">http://davybrion.com/blog/?p=1764#comment-22785</guid> <description>Have you looked at www.codeplex.com/AutoAudit?  It was written by Paul Nielsen, a SQL Server MVP who writes the SQL Server Bible books.</description> <content:encoded><![CDATA[<p>Have you looked at <a
href="http://www.codeplex.com/AutoAudit" rel="nofollow">http://www.codeplex.com/AutoAudit</a>?  It was written by Paul Nielsen, a SQL Server MVP who writes the SQL Server Bible books.</p> ]]></content:encoded> </item> <item><title>By: Think Before Coding</title><link>http://davybrion.com/blog/2009/10/database-audit-trails-how-would-you-do-it/comment-page-1/#comment-22769</link> <dc:creator>Think Before Coding</dc:creator> <pubDate>Tue, 20 Oct 2009 16:04:54 +0000</pubDate> <guid
isPermaLink="false">http://davybrion.com/blog/?p=1764#comment-22769</guid> <description>Your needs seem to be more at the business level (knowing which user did what, and when) than at the database level.
If you keep a history of your database, you&#039;ll only know the state at a given date, but you won&#039;t be able to know why those changes happened. Follow Jonathan&#039;s advice: store domain events as your main storage, then derive RDBMS views from the events. You can then easily reconstruct any view of your data from your event stream.</description> <content:encoded><![CDATA[<p>Your needs seem to be more at the business level (knowing which user did what, and when) than at the database level.<br
/> If you keep a history of your database, you&#8217;ll only know the state at a given date, but you won&#8217;t be able to know why those changes happened.</p><p>Follow Jonathan&#8217;s advice: store domain events as your main storage, then derive RDBMS views from the events. You can then easily reconstruct any view of your data from your event stream.</p> ]]></content:encoded> </item> <item><title>By: Dalibor Carapic</title><link>http://davybrion.com/blog/2009/10/database-audit-trails-how-would-you-do-it/comment-page-1/#comment-22768</link> <dc:creator>Dalibor Carapic</dc:creator> <pubDate>Tue, 20 Oct 2009 12:13:24 +0000</pubDate> <guid
isPermaLink="false">http://davybrion.com/blog/?p=1764#comment-22768</guid> <description>Repost, forgot it&#039;s HTML:
If you are using Microsoft SQL Server, I would recommend serializing objects into XML and creating one table that contains all the relevant version data (timestamp, user, etc.) and one XML column that contains an XML representation of the record, for example:
&lt;table name=&quot;user&quot;&gt;
&lt;col name=&quot;id&quot;&gt;100&lt;/col&gt;
&lt;col name=&quot;name&quot;&gt;test&lt;/col&gt;
... etc ...
&lt;/table&gt; You get one audit table (which I guess could be filled by some sort of generic function) that you can query for the data you need (via MS SQL XML querying capabilities). If you are using another database that has no XML querying, I would still go with the same principle, but maybe not with the XML format (maybe &#039;column=value&#039;).</description> <content:encoded><![CDATA[<p>Repost, forgot it&#8217;s HTML:<br
/> If you are using Microsoft SQL Server, I would recommend serializing objects into XML and creating one table that contains all the relevant version data (timestamp, user, etc.) and one XML column that contains an XML representation of the record, for example:<br
/> &lt;table name=&quot;user&quot;&gt;<br
/> &lt;col name=&quot;id&quot;&gt;100&lt;/col&gt;<br
/> &lt;col name=&quot;name&quot;&gt;test&lt;/col&gt;<br
/> &#8230; etc &#8230;<br
/> &lt;/table&gt;</p><p>You get one audit table (which I guess could be filled by some sort of generic function) that you can query for the data you need (via MS SQL XML querying capabilities).</p><p>If you are using another database that has no XML querying, I would still go with the same principle, but maybe not with the XML format (maybe &#8216;column=value&#8217;).</p> ]]></content:encoded> </item> <item><title>By: Dalibor Carapic</title><link>http://davybrion.com/blog/2009/10/database-audit-trails-how-would-you-do-it/comment-page-1/#comment-22767</link> <dc:creator>Dalibor Carapic</dc:creator> <pubDate>Tue, 20 Oct 2009 12:12:13 +0000</pubDate> <guid
isPermaLink="false">http://davybrion.com/blog/?p=1764#comment-22767</guid> <description>If you are using Microsoft SQL Server, I would recommend serializing objects into XML and creating one table that contains all the relevant version data (timestamp, user, etc.) and one XML column that contains an XML representation of the record, for example: 100
test
... etc ... You get one audit table (which I guess could be filled by some sort of generic function) that you can query for the data you need (via MS SQL XML querying capabilities). If you are using another database that has no XML querying, I would still go with the same principle, but maybe not with the XML format (maybe &#039;column=value&#039;).</description> <content:encoded><![CDATA[<p>If you are using Microsoft SQL Server, I would recommend serializing objects into XML and creating one table that contains all the relevant version data (timestamp, user, etc.) and one XML column that contains an XML representation of the record, for example:</p><p> 100<br
/> test<br
/> &#8230; etc &#8230;</p><p>You get one audit table (which I guess could be filled by some sort of generic function) that you can query for the data you need (via MS SQL XML querying capabilities).</p><p>If you are using another database that has no XML querying, I would still go with the same principle, but maybe not with the XML format (maybe &#8216;column=value&#8217;).</p> ]]></content:encoded> </item> <item><title>By: James L</title><link>http://davybrion.com/blog/2009/10/database-audit-trails-how-would-you-do-it/comment-page-1/#comment-22766</link> <dc:creator>James L</dc:creator> <pubDate>Tue, 20 Oct 2009 09:36:02 +0000</pubDate> <guid
isPermaLink="false">http://davybrion.com/blog/?p=1764#comment-22766</guid> <description>Journal the changes to a text file</description> <content:encoded><![CDATA[<p>Journal the changes to a text file</p> ]]></content:encoded> </item> <item><title>By: Jack</title><link>http://davybrion.com/blog/2009/10/database-audit-trails-how-would-you-do-it/comment-page-1/#comment-22764</link> <dc:creator>Jack</dc:creator> <pubDate>Tue, 20 Oct 2009 09:20:11 +0000</pubDate> <guid
isPermaLink="false">http://davybrion.com/blog/?p=1764#comment-22764</guid> <description>It is something like SQL Server&#039;s system tables: sys.tables, sys.columns, and others.</description> <content:encoded><![CDATA[<p>It is something like SQL Server&#8217;s system tables: sys.tables, sys.columns, and others.</p> ]]></content:encoded> </item> <item><title>By: den Ben</title><link>http://davybrion.com/blog/2009/10/database-audit-trails-how-would-you-do-it/comment-page-1/#comment-22763</link> <dc:creator>den Ben</dc:creator> <pubDate>Tue, 20 Oct 2009 05:54:40 +0000</pubDate> <guid
isPermaLink="false">http://davybrion.com/blog/?p=1764#comment-22763</guid> <description>@Josh
The first approach in the post uses table- and column metadata for use in the audit trail table.  You can easily query for a specific entity so different archiving policies per entity wouldn&#039;t be a problem.  Don&#039;t see why that would be error prone...</description> <content:encoded><![CDATA[<p>@Josh<br
/> The first approach in the post uses table- and column metadata for use in the audit trail table.  You can easily query for a specific entity so different archiving policies per entity wouldn&#8217;t be a problem.  Don&#8217;t see why that would be error prone&#8230;</p> ]]></content:encoded> </item> <item><title>By: Carel Lotz</title><link>http://davybrion.com/blog/2009/10/database-audit-trails-how-would-you-do-it/comment-page-1/#comment-22762</link> <dc:creator>Carel Lotz</dc:creator> <pubDate>Tue, 20 Oct 2009 05:24:48 +0000</pubDate> <guid
isPermaLink="false">http://davybrion.com/blog/?p=1764#comment-22762</guid> <description>If you are using SQL Server 2008, it has support for this out of the box through its Change Data Capture mechanism.  It doesn&#039;t support all your requirements (like tracking the user), but it is interesting to see how they approach and solve the problem.</description> <content:encoded><![CDATA[<p>If you are using SQL Server 2008, it has support for this out of the box through its Change Data Capture mechanism.  It doesn&#8217;t support all your requirements (like tracking the user), but it is interesting to see how they approach and solve the problem.</p> ]]></content:encoded> </item> <item><title>By: josh</title><link>http://davybrion.com/blog/2009/10/database-audit-trails-how-would-you-do-it/comment-page-1/#comment-22761</link> <dc:creator>josh</dc:creator> <pubDate>Tue, 20 Oct 2009 05:05:43 +0000</pubDate> <guid
isPermaLink="false">http://davybrion.com/blog/?p=1764#comment-22761</guid> <description>Honestly, I would consider archiving policy a part of this question.  I&#039;ve used two auditing approaches, both with a single table.  The first, really simple one just logged a message describing the change and the user.  It was good for proving who did what, but not for point-in-time inspections, because the change was only a string message. The second was similar to the first suggestion: a single table with columns for timestamp info, user, entity, value before change, and value after change. This was effective both for proving who did what and for point-in-time inspection, even though the latter took some effort. We did encounter an issue when it came to archiving policy. If you were going to, say, archive everything over 30 days, it was fine.  However, if you wanted to vary the archive terms by entity, it was a problem, since the logging for all entities was in a single table.  If you need to vary the archival term by entity, you need separate auditing tables for each entity, or duplicates of each table for auditing.  I suppose you could manually extract audit entries by entity type, but that&#039;s error prone.</description> <content:encoded><![CDATA[<p>Honestly, I would consider archiving policy a part of this question.  I&#8217;ve used two auditing approaches, both with a single table.  The first, really simple one just logged a message describing the change and the user.  It was good for proving who did what, but not for point-in-time inspections, because the change was only a string message. The second was similar to the first suggestion: a single table with columns for timestamp info, user, entity, value before change, and value after change. This was effective both for proving who did what and for point-in-time inspection, even though the latter took some effort.</p><p>We did encounter an issue when it came to archiving policy. 
If you were going to, say, archive everything over 30 days, it was fine.  However, if you wanted to vary the archive terms by entity, it was a problem, since the logging for all entities was in a single table.  If you need to vary the archival term by entity, you need separate auditing tables for each entity, or duplicates of each table for auditing.  I suppose you could manually extract audit entries by entity type, but that&#8217;s error prone.</p> ]]></content:encoded> </item> <item><title>By: Jonathan Oliver</title><link>http://davybrion.com/blog/2009/10/database-audit-trails-how-would-you-do-it/comment-page-1/#comment-22760</link> <dc:creator>Jonathan Oliver</dc:creator> <pubDate>Tue, 20 Oct 2009 03:45:24 +0000</pubDate> <guid
isPermaLink="false">http://davybrion.com/blog/?p=1764#comment-22760</guid> <description>We kicked around temporal database design for a bit to show how data changed across time.  It quickly made things way too complicated to be practical or even usable.  You will definitely want to look at Greg Young&#039;s patterns on the subject--Command Query Responsibility Separation.  Take a look at Greg&#039;s InfoQ presentation (starting at the 4 minute mark):
http://www.infoq.com/presentations/greg-young-unshackle-qcon08 I&#039;ve compiled some resources on his patterns on my blog:
http://jonathan-oliver.blogspot.com/2009/03/dddd-and-cqs-getting-started.html</description> <content:encoded><![CDATA[<p>We kicked around temporal database design for a bit to show how data changed across time.  It quickly made things way too complicated to be practical or even usable.  You will definitely want to look at Greg Young&#8217;s patterns on the subject&#8211;Command Query Responsibility Separation.  Take a look at Greg&#8217;s InfoQ presentation (starting at the 4 minute mark):<br
/> <a
href="http://www.infoq.com/presentations/greg-young-unshackle-qcon08" rel="nofollow">http://www.infoq.com/presentations/greg-young-unshackle-qcon08</a></p><p>I&#8217;ve compiled some resources on his patterns on my blog:<br
/> <a
href="http://jonathan-oliver.blogspot.com/2009/03/dddd-and-cqs-getting-started.html" rel="nofollow">http://jonathan-oliver.blogspot.com/2009/03/dddd-and-cqs-getting-started.html</a></p> ]]></content:encoded> </item> <item><title>By: Phil Haselden</title><link>http://davybrion.com/blog/2009/10/database-audit-trails-how-would-you-do-it/comment-page-1/#comment-22759</link> <dc:creator>Phil Haselden</dc:creator> <pubDate>Tue, 20 Oct 2009 01:47:24 +0000</pubDate> <guid
isPermaLink="false">http://davybrion.com/blog/?p=1764#comment-22759</guid> <description>Take a look at Temporal database patterns to see if that meets your needs. I blogged a little about this a while back here http://haselden.spaces.live.com/blog/cns!C7AD1671702F1899!118.entry. That entry contains links to Martin Fowler articles etc on the subject.</description> <content:encoded><![CDATA[<p>Take a look at Temporal database patterns to see if that meets your needs. I blogged a little about this a while back here <a
href="http://haselden.spaces.live.com/blog/cns!C7AD1671702F1899!118.entry" rel="nofollow">http://haselden.spaces.live.com/blog/cns!C7AD1671702F1899!118.entry</a>. That entry contains links to Martin Fowler articles etc on the subject.</p> ]]></content:encoded> </item> <item><title>By: Justin Rudd</title><link>http://davybrion.com/blog/2009/10/database-audit-trails-how-would-you-do-it/comment-page-1/#comment-22757</link> <dc:creator>Justin Rudd</dc:creator> <pubDate>Mon, 19 Oct 2009 20:30:14 +0000</pubDate> <guid
isPermaLink="false">http://davybrion.com/blog/?p=1764#comment-22757</guid> <description>My $0.02 - maybe you&#039;ve already had these conversations but... Figure out what &quot;Y&quot; is first.  Figuring out &quot;Y&quot; is much harder than figuring out how to store the data.  You are (probably) thinking about storing data till the end of time.  I once worked on a system that wanted to store 18 months of archive data.  The business touted this as a feature of our platform.  No customer we talked to needed more than 30 days.  But I designed for 18 months, fast queries, etc. because the business guys couldn&#039;t define a real &quot;Y&quot;. Once &quot;Y&quot; was set at 30 days, scaling it to 60 days or 90 days was trivial because the 30 day solution was so much simpler. With that said, I also got the business guys to give me -
1.) a valid failure rate for archive data (basically, on any given day with a normal flow of traffic, how much was I allowed to lose)
2.) an amount of time before the archive data had to be searchable (e.g. immediately, 2 hours, a day?).
3.) What is the latency on searching?  Does it have to be as fast as the real system?  Or can it take some time? With those 3 pieces of information, in my case I was able to offload archiving and searching onto a messaging system, and I only saved the delta.  I implemented VCDIFF (http://www.faqs.org/rfcs/rfc3284.html) on the textual representation.  I didn&#039;t do anything fancy with the deltas.  Just kept them in a chain (1 -&gt; 2 -&gt; 3 -&gt; etc).  Had Eric Sink&#039;s article (http://www.ericsink.com/entries/time_space_tradeoffs.html) been around, I probably would have gone with a combination of key frame chains and flower chains. Anyway - the deltas were stored in BDB, in a B-Tree database.  I used a simple consistent hashing algorithm to pick the BDB to store to.  Then I had a cron job that ran every hour to stop the message processing system, back up the BDB to a SAN, and restart the messaging system. Search was also done through messaging.  But my search was really easy.  The client would put in their entity IDs, I would issue the query to the BDBs and aggregate the results.  It typically took about 250 to 700 milliseconds (depending on network saturation).</description> <content:encoded><![CDATA[<p>My $0.02 &#8211; maybe you&#8217;ve already had these conversations but&#8230;</p><p>Figure out what &#8220;Y&#8221; is first.  Figuring out &#8220;Y&#8221; is much harder than figuring out how to store the data.  You are (probably) thinking about storing data till the end of time.  I once worked on a system that wanted to store 18 months of archive data.  The business touted this as a feature of our platform.  No customer we talked to needed more than 30 days.  
But I designed for 18 months, fast queries, etc. because the business guys couldn&#8217;t define a real &#8220;Y&#8221;.</p><p>Once &#8220;Y&#8221; was set at 30 days, scaling it to 60 days or 90 days was trivial because the 30 day solution was so much simpler.</p><p>With that said, I also got the business guys to give me -<br
/> 1.) a valid failure rate for archive data (basically, on any given day with a normal flow of traffic, how much was I allowed to lose)<br
/> 2.) an amount of time before the archive data had to be searchable (e.g. immediately, 2 hours, a day?).<br
/> 3.) What is the latency on searching?  Does it have to be as fast as the real system?  Or can it take some time?</p><p>With those 3 pieces of information, in my case I was able to offload archiving and searching onto a messaging system, and I only saved the delta.  I implemented VCDIFF (<a
href="http://www.faqs.org/rfcs/rfc3284.html" rel="nofollow">http://www.faqs.org/rfcs/rfc3284.html</a>) on the textual representation.  I didn&#8217;t do anything fancy with the deltas.  Just kept them in a chain (1 -&gt; 2 -&gt; 3 -&gt; etc).  Had Eric Sink&#8217;s article (<a
href="http://www.ericsink.com/entries/time_space_tradeoffs.html" rel="nofollow">http://www.ericsink.com/entries/time_space_tradeoffs.html</a>) been around, I probably would have gone with a combination of key frame chains and flower chains.</p><p>Anyway &#8211; the deltas were stored in BDB, in a B-Tree database.  I used a simple consistent hashing algorithm to pick the BDB to store to.  Then I had a cron job that ran every hour to stop the message processing system, back up the BDB to a SAN, and restart the messaging system.</p><p>Search was also done through messaging.  But my search was really easy.  The client would put in their entity IDs, I would issue the query to the BDBs and aggregate the results.  It typically took about 250 to 700 milliseconds (depending on network saturation).</p> ]]></content:encoded> </item> <item><title>By: dave-ilsw</title><link>http://davybrion.com/blog/2009/10/database-audit-trails-how-would-you-do-it/comment-page-1/#comment-22756</link> <dc:creator>dave-ilsw</dc:creator> <pubDate>Mon, 19 Oct 2009 20:23:16 +0000</pubDate> <guid
isPermaLink="false">http://davybrion.com/blog/?p=1764#comment-22756</guid> <description>We use a single change log table on a project that I work on. We have change log ID, time stamp, user name, table name, set ID, operation, record ID, field name, old value, new value fields. For inserts we just record a single entry. For updates, we record the field name and old and new values. We use a stored procedure to create each change log entry. The stored procedure only records change log entries for fields with a changed value. The set ID makes it easy to group the multiple change log entries that are created when multiple fields change in a single update. We use a Python script to auto-generate the various triggers that are needed for each field in each table that is subject to change logging (there are some tables that don&#039;t get any change logs and we don&#039;t log changes to memo fields, for example).</description> <content:encoded><![CDATA[<p>We use a single change log table on a project that I work on. We have change log ID, time stamp, user name, table name, set ID, operation, record ID, field name, old value, new value fields.</p><p>For inserts we just record a single entry. For updates, we record the field name and old and new values. We use a stored procedure to create each change log entry. 
The stored procedure only records change log entries for fields with a changed value.</p><p>The set ID makes it easy to group the multiple change log entries that are created when multiple fields change in a single update.</p><p>We use a Python script to auto-generate the various triggers that are needed for each field in each table that is subject to change logging (there are some tables that don&#8217;t get any change logs and we don&#8217;t log changes to memo fields, for example).</p> ]]></content:encoded> </item> <item><title>By: Joseph Daigle</title><link>http://davybrion.com/blog/2009/10/database-audit-trails-how-would-you-do-it/comment-page-1/#comment-22753</link> <dc:creator>Joseph Daigle</dc:creator> <pubDate>Mon, 19 Oct 2009 19:04:06 +0000</pubDate> <guid
isPermaLink="false">http://davybrion.com/blog/?p=1764#comment-22753</guid> <description>ESRI has a &quot;geodatabase&quot; product which I am very familiar with. The spatial part isn&#039;t important, but they have built an archiving solution into the database. Internally it uses your 2nd solution of an audit table for every physical table that is being archived. Querying for the historical version of a particular table is fairly easy. However, you are correct that it is more expensive to query what a single user did. But it&#039;s only N queries where N is the number of tables, which isn&#039;t TOO bad for a single reporting-type query (you could cache or store the results). A third approach could be to combine both ideas. The audit tables are a REALLY GOOD IDEA for doing this sort of archiving. However, you could also store in a separate table all of the user operations which were performed and when. This way you can easily query what data a user touched and when. Then you can drill down into the historical data as needed.</description> <content:encoded><![CDATA[<p>ESRI has a &#8220;geodatabase&#8221; product which I am very familiar with. The spatial part isn&#8217;t important, but they have built an archiving solution into the database. Internally it uses your 2nd solution of an audit table for every physical table that is being archived.</p><p>Querying for the historical version of a particular table is fairly easy. However, you are correct that it is more expensive to query what a single user did. But it&#8217;s only N queries where N is the number of tables, which isn&#8217;t TOO bad for a single reporting-type query (you could cache or store the results).</p><p>A third approach could be to combine both ideas. The audit tables are a REALLY GOOD IDEA for doing this sort of archiving. However, you could also store in a separate table all of the user operations which were performed and when. 
This way you can easily query what data a user touched and when. Then you can drill down into the historical data as needed.</p> ]]></content:encoded> </item> </channel> </rss>
