From nobody at rubyforge.org Sun Feb 4 16:50:08 2007
From: nobody at rubyforge.org (nobody at rubyforge.org)
Date: Sun, 4 Feb 2007 16:50:08 -0500 (EST)
Subject: [Archipelago-submits] [196] trunk/archipelago: added diagram for dump maintenance
Message-ID: <20070204215008.E64C15242A68@rubyforge.org>

Revision: 196
Author: zond
Date: 2007-02-04 16:50:08 -0500 (Sun, 04 Feb 2007)

Log Message:
-----------
added diagram for dump maintenance

Modified Paths:
--------------
    trunk/archipelago/lib/archipelago/dump.rb

Added Paths:
-----------
    trunk/archipelago/doc/
    trunk/archipelago/doc/dump maintenance.dia

Added: trunk/archipelago/doc/dump maintenance.dia
===================================================================
(Binary files differ)

Property changes on: trunk/archipelago/doc/dump maintenance.dia
___________________________________________________________________
Name: svn:mime-type
   + application/octet-stream

Modified: trunk/archipelago/lib/archipelago/dump.rb
===================================================================
--- trunk/archipelago/lib/archipelago/dump.rb 2007-01-30 10:16:53 UTC (rev 195)
+++ trunk/archipelago/lib/archipelago/dump.rb 2007-02-04 21:50:08 UTC (rev 196)
@@ -40,8 +40,19 @@
  #
  @persistence_provider = options[:persistence_provider] || Archipelago::Hashish::BerkeleyHashishProvider.new(Pathname.new(File.expand_path(__FILE__)).parent.join("cove_tanker.db"))

+ #
+ # The provider of checksumming magic and chunk distribution.
+ #
  @officer = options[:officer] || Archipelago::Sanitation::CLEANER

+ #
+ # The provider of discovery and service change subscriptions.
+ #
+ @captain = options[:captain] || Archipelago::Pirate::BLACKBEARD
+
+ #
+ # The databases for each owner.
+ #
  @dbs = {}

  #
@@ -108,7 +119,6 @@
  # a new chunk to the last service in our new relevant successor list.
  #
  def initialize_subscriptions
-   # IMPLEMENT!
  end

  #
From nobody at rubyforge.org Mon Feb 5 09:43:08 2007
From: nobody at rubyforge.org (nobody at rubyforge.org)
Date: Mon, 5 Feb 2007 09:43:08 -0500 (EST)
Subject: [Archipelago-submits] [197] trunk/archipelago/doc/dump maintenance.dia: moved the check process
Message-ID: <20070205144308.84BCF52420C4@rubyforge.org>

Revision: 197
Author: zond
Date: 2007-02-05 09:43:08 -0500 (Mon, 05 Feb 2007)

Log Message:
-----------
moved the check process

Modified Paths:
--------------
    trunk/archipelago/doc/dump maintenance.dia

Modified: trunk/archipelago/doc/dump maintenance.dia
===================================================================
(Binary files differ)

From nobody at rubyforge.org Wed Feb 7 13:00:39 2007
From: nobody at rubyforge.org (nobody at rubyforge.org)
Date: Wed, 7 Feb 2007 13:00:39 -0500 (EST)
Subject: [Archipelago-submits] [198] trunk/oneliner/lib/oneliner/superstring.rb: added comments to superstring.rb.
Message-ID: <20070207180040.38281524267A@rubyforge.org>

Revision: 198
Author: zond
Date: 2007-02-07 13:00:38 -0500 (Wed, 07 Feb 2007)

Log Message:
-----------
added comments to superstring.rb.
made superstring cache whether decoding is done.

Modified Paths:
--------------
    trunk/oneliner/lib/oneliner/superstring.rb

Modified: trunk/oneliner/lib/oneliner/superstring.rb
===================================================================
--- trunk/oneliner/lib/oneliner/superstring.rb 2007-02-05 14:43:08 UTC (rev 197)
+++ trunk/oneliner/lib/oneliner/superstring.rb 2007-02-07 18:00:38 UTC (rev 198)
@@ -33,6 +33,10 @@
    P << (((1 - P[1]) * F) / ((F - 1) * i * (i - 1)))
  end

+ #
+ # Will return a String containing +requested_size+ nr of bytes
+ # as an online coded chunk of blocks for this SuperString.
+ #
  def encode(requested_size)
    raise "requested size is too small for metadata (8 bytes)" unless requested_size > 7

@@ -54,9 +58,17 @@
    return rval + compact(blocks)
  end

+ #
+ # Will try to decode this SuperString using the given +chunk+ and all
+ # formerly given chunks.
+ #
+ # Returns whether decoding is done.
+ #
  def decode!(chunk)
    raise "chunk is too small for metadata (8 bytes)" unless chunk.size > 7

+   @decode_done = nil
+
    context = Context.new

    requested_size = chunk[0..3].unpack("L*").first
@@ -96,9 +108,15 @@
    end
  end

+ #
+ # Returns whether decoding is done.
+ #
  def decode_done
-   data = @blocks[0...@nr_of_data_blocks]
-   data.compact.size == data.size
+   if @decode_done.nil?
+     data = @blocks[0...@nr_of_data_blocks]
+     @decode_done = data.compact.size == data.size
+   end
+   return @decode_done
  end

  private

From nobody at rubyforge.org Wed Feb 7 13:01:30 2007
From: nobody at rubyforge.org (nobody at rubyforge.org)
Date: Wed, 7 Feb 2007 13:01:30 -0500 (EST)
Subject: [Archipelago-submits] [199] trunk/oneliner/lib/oneliner/superstring.rb: changed name from decode_done to decode_done?
Message-ID: <20070207180130.AC2455242677@rubyforge.org>

Revision: 199
Author: zond
Date: 2007-02-07 13:01:30 -0500 (Wed, 07 Feb 2007)

Log Message:
-----------
changed name from decode_done to decode_done?

Modified Paths:
--------------
    trunk/oneliner/lib/oneliner/superstring.rb

Modified: trunk/oneliner/lib/oneliner/superstring.rb
===================================================================
--- trunk/oneliner/lib/oneliner/superstring.rb 2007-02-07 18:00:38 UTC (rev 198)
+++ trunk/oneliner/lib/oneliner/superstring.rb 2007-02-07 18:01:30 UTC (rev 199)
@@ -96,7 +96,7 @@
    nil while do_decode(context)
  end

- if decode_done
+ if decode_done?
    self.replace(compact(@blocks[0...@nr_of_data_blocks])[0...requested_size])
    return true
  else
@@ -111,7 +111,7 @@
  #
  # Returns whether decoding is done.
  #
- def decode_done
+ def decode_done?
    if @decode_done.nil?
      data = @blocks[0...@nr_of_data_blocks]
      @decode_done = data.compact.size == data.size

From nobody at rubyforge.org Wed Feb 7 13:14:47 2007
From: nobody at rubyforge.org (nobody at rubyforge.org)
Date: Wed, 7 Feb 2007 13:14:47 -0500 (EST)
Subject: [Archipelago-submits] [200] trunk/oneliner/lib/oneliner/superstring.rb: made decode_done? more clever
Message-ID: <20070207181447.658075242691@rubyforge.org>

Revision: 200
Author: zond
Date: 2007-02-07 13:14:46 -0500 (Wed, 07 Feb 2007)

Log Message:
-----------
made decode_done? more clever

Modified Paths:
--------------
    trunk/oneliner/lib/oneliner/superstring.rb

Modified: trunk/oneliner/lib/oneliner/superstring.rb
===================================================================
--- trunk/oneliner/lib/oneliner/superstring.rb 2007-02-07 18:01:30 UTC (rev 199)
+++ trunk/oneliner/lib/oneliner/superstring.rb 2007-02-07 18:14:46 UTC (rev 200)
@@ -112,7 +112,7 @@
  # Returns whether decoding is done.
  #
  def decode_done?
-   if @decode_done.nil?
+   if @decode_done.nil? && @nr_of_data_blocks
      data = @blocks[0...@nr_of_data_blocks]
      @decode_done = data.compact.size == data.size
    end

From nobody at rubyforge.org Wed Feb 7 13:16:42 2007
From: nobody at rubyforge.org (nobody at rubyforge.org)
Date: Wed, 7 Feb 2007 13:16:42 -0500 (EST)
Subject: [Archipelago-submits] [201] trunk/archipelago: changed to a single Btree instead of a set of Hashes for dump persistence.
Message-ID: <20070207181642.8B8E9524268A@rubyforge.org>

Revision: 201
Author: zond
Date: 2007-02-07 13:16:42 -0500 (Wed, 07 Feb 2007)

Log Message:
-----------
changed to a single Btree instead of a set of Hashes for dump persistence.
fixed the tests accordingly.
fixed test bugs.
Modified Paths:
--------------
    trunk/archipelago/lib/archipelago/dump.rb
    trunk/archipelago/lib/archipelago/hashish.rb
    trunk/archipelago/lib/archipelago/sanitation.rb
    trunk/archipelago/tests/dump_test.rb
    trunk/archipelago/tests/sanitation_test.rb

Modified: trunk/archipelago/lib/archipelago/dump.rb
===================================================================
--- trunk/archipelago/lib/archipelago/dump.rb 2007-02-07 18:14:46 UTC (rev 200)
+++ trunk/archipelago/lib/archipelago/dump.rb 2007-02-07 18:16:42 UTC (rev 201)
@@ -51,9 +51,9 @@
  @captain = options[:captain] || Archipelago::Pirate::BLACKBEARD

  #
- # The databases for each owner.
+ # The database where the data lives.
  #
- @dbs = {}
+ @db = @persistence_provider.get_dup_tree("db")

  #
  # Use the given options to initialize the publishable
@@ -67,32 +67,21 @@
    initialize_subscriptions
  end

- def insert!(key, values, owner_id)
-   db = @dbs[owner_id] ||= @persistence_provider.get_dup_hashish(owner_id)
-   c = db.dup_count(key)
-   if c < values.size
-     (values.size - c).times do
-       db[key] = values.shift
+ def insert!(key, values)
+   @db.env.begin(BDB::TXN_COMMIT, @db) do |txn, db|
+     db.delete(key)
+     values.each do |value|
+       db[key] = value
      end
-   elsif c > values.size
-     db.env.begin(BDB::TXN_COMMIT, db) do
-       db.delete(key)
-       values.each do |value|
-         db[key] = value
-       end
-     end
    end
  end

- def fetch(key, owner_id)
-   db = @dbs[owner_id] ||= @persistence_provider.get_dup_hashish(owner_id)
-   return db.duplicates(key)
+ def fetch(key)
+   return @db.duplicates(key)
  end

- def delete!(key, owner_id)
-   db = @dbs[owner_id] ||= @persistence_provider.get_dup_hashish(owner_id)
-   db.delete(key)
-   return nil
+ def delete!(key)
+   return @db.delete(key)
  end

  private

Modified: trunk/archipelago/lib/archipelago/hashish.rb
===================================================================
--- trunk/archipelago/lib/archipelago/hashish.rb 2007-02-07 18:14:46 UTC (rev 200)
+++ trunk/archipelago/lib/archipelago/hashish.rb 2007-02-07 18:16:42 UTC (rev 201)
@@ -274,7 +274,7 @@
  # Returns something acting like an uncached Berkeley Hash DB instance allowing duplicate entries
  # and transactions using +name+.
  #
- def get_dup_hashish(name)
+ def get_dup_tree(name)
    db = BDB::Hash.open(Pathname.new(File.join(@env.home, name)).expand_path,
                        nil,
                        BDB::CREATE | BDB::NOMMAP,

Modified: trunk/archipelago/lib/archipelago/sanitation.rb
===================================================================
--- trunk/archipelago/lib/archipelago/sanitation.rb 2007-02-07 18:14:46 UTC (rev 200)
+++ trunk/archipelago/lib/archipelago/sanitation.rb 2007-02-07 18:16:42 UTC (rev 201)
@@ -68,47 +68,55 @@
  end

  def []=(key, value)
+   t = [Time.now.to_i].pack("I")
    super_string = Oneliner::SuperString.new(value)
    nr_of_needed_chunks = @minimum_nr_of_chunks / @minimum_redundancy_ratio
    chunk_size = (super_string.size / nr_of_needed_chunks) + @metadata_overhead
    chunk_size = @minimum_recoverable_size / nr_of_needed_chunks if chunk_size < @minimum_recoverable_size / nr_of_needed_chunks
-   owner_id, dump_hash = responsible_sites(key, @minimum_nr_of_chunks)
+   dump_hash = responsible_sites(key, @minimum_nr_of_chunks)

    dump_hash.each do |dump_id, nr_of_chunks_needed|
      @sites[dump_id][:service].insert!(key,
                                        (0...nr_of_chunks_needed).collect do |nr_of_chunks_needed|
-                                         super_string.encode(chunk_size)
-                                       end,
-                                       owner_id)
+                                         t + super_string.encode(chunk_size)
+                                       end)
    end
  end

  def delete!(key)
-   owner_id, dump_hash = responsible_sites(key, @minimum_nr_of_chunks)
+   dump_hash = responsible_sites(key, @minimum_nr_of_chunks)

    dump_hash.each do |dump_id, nr_of_chunks_available|
-     @sites[dump_id][:service].delete!(key,
-                                       owner_id)
+     @sites[dump_id][:service].delete!(key)
    end
  end

  def [](key)
-   owner_id, dump_hash = responsible_sites(key, @minimum_nr_of_chunks)
+   dump_hash = responsible_sites(key, @minimum_nr_of_chunks)
    dump_ids = dump_hash.keys

+   newest_timestamp = "\000\000\000\000"
    rval = Oneliner::SuperString.new
-   decoded = false
-   found_chunks = false
-   while !decoded && dump_ids.size > 0
-     @sites[dump_ids.shift][:service].fetch(key, owner_id).each do |chunk|
-       found_chunks = true
-       decoded = decoded || rval.decode!(chunk)
+   while !rval.decode_done? && dump_ids.size > 0
+     chunks = @sites[dump_ids.shift][:service].fetch(key)
+     while !rval.decode_done? && chunks.size > 0
+       chunk = chunks.shift
+       t = chunk[0...4]
+       data = chunk[4..-1]
+       if t > newest_timestamp
+         rval = Oneliner::SuperString.new
+         newest_timestamp = t
+       end
+
+       if t == newest_timestamp
+         rval.decode!(data)
+       end
      end
    end

-   if decoded
+   if rval.decode_done?
      return rval.to_s
    else
-     raise NotEnoughDataException.new(self, key) if found_chunks
+     raise NotEnoughDataException.new(self, key) if newest_timestamp != "\000\000\000\000"
      return nil
    end
@@ -119,8 +127,9 @@
  end

  #
- # Returns [first_greater_successor, {service_id => nr_of_chunks_it_should_have} with size +n+]
- # from @sites having id > +key+.
+ # Returns {service_id => nr_of_chunks_it_should_have}
+ # where sum(nr_of_chunks_it_should_have) == +n+
+ # from @sites having service_id > +key+.
  #
  # Will loop to the beginning if the number of elements run out.
  #
@@ -128,13 +137,11 @@
    raise NoRemoteDatabaseAvailableException.new(self) if @sites.empty?
    rval = {}
-   owner_id = nil
    get_least_greater_than(@sites, key, n).each do |id|
      rval[id] ||= 0
      rval[id] += 1
-     owner_id ||= id
    end
-   return [owner_id, rval]
+   return rval
  end

  private

Modified: trunk/archipelago/tests/dump_test.rb
===================================================================
--- trunk/archipelago/tests/dump_test.rb 2007-02-07 18:14:46 UTC (rev 200)
+++ trunk/archipelago/tests/dump_test.rb 2007-02-07 18:16:42 UTC (rev 201)
@@ -1,4 +1,7 @@

+MC_ENABLED = true
+BLACKBEARD_ENABLED = true
+
 require File.join(File.dirname(__FILE__), 'test_helper')

 class DumpTest < Test::Unit::TestCase
@@ -14,32 +17,24 @@
   end

   def test_insert_fetch
-    @d.insert!("key", ["value", "value1"], "owner")
-    @d.insert!("key2", ["value2"], "owner")
-    @d.insert!("key2", ["value3"], "owner2")
+    @d.insert!("key", ["value", "value1"])
+    @d.insert!("key2", ["value2"])
+    @d.insert!("key2", ["value3"])
     assert_equal(["value", "value1"].sort,
-                 @d.fetch("key", "owner").sort)
-    assert_equal(["value2"],
-                 @d.fetch("key2", "owner"))
-    assert_equal(["value3"],
-                 @d.fetch("key2", "owner2"))
+                 @d.fetch("key").sort)
+    assert_equal(["value3"].sort,
+                 @d.fetch("key2").sort)

-    @d.insert!("key", ["value2", "value3"], "owner")
-    assert_equal(["value", "value1"].sort,
-                 @d.fetch("key", "owner").sort)
-    @d.insert!("key", ["value2"], "owner")
-    assert_equal(["value2"],
-                 @d.fetch("key", "owner"))
-    @d.insert!("key", ["value4", "value5"], "owner")
-    assert_equal(["value2", "value4"].sort,
-                 @d.fetch("key", "owner").sort)
+    @d.insert!("key", ["value2", "value3"])
+    assert_equal(["value2", "value3"].sort,
+                 @d.fetch("key").sort)

-    @d.delete!("key", "owner")
-    @d.delete!("key2", "owner2")
+    @d.delete!("key")
+    @d.delete!("key2")
     assert_equal([],
-                 @d.fetch("key", "owner"))
+                 @d.fetch("key"))
     assert_equal([],
-                 @d.fetch("key2", "owner2"))
+                 @d.fetch("key2"))
   end

 end

Modified: trunk/archipelago/tests/sanitation_test.rb
===================================================================
--- trunk/archipelago/tests/sanitation_test.rb 2007-02-07 18:14:46 UTC (rev 200)
+++ trunk/archipelago/tests/sanitation_test.rb 2007-02-07 18:16:42 UTC (rev 201)
@@ -1,4 +1,7 @@

+MC_ENABLED = true
+BLACKBEARD_ENABLED = true
+
 require File.join(File.dirname(__FILE__), 'test_helper')

 class SanitationTest < Test::Unit::TestCase

From nobody at rubyforge.org Thu Feb 8 08:33:55 2007
From: nobody at rubyforge.org (nobody at rubyforge.org)
Date: Thu, 8 Feb 2007 08:33:55 -0500 (EST)
Subject: [Archipelago-submits] [202] trunk/archipelago: fixed working tests for sanitation.
Message-ID: <20070208133355.D15FF524252E@rubyforge.org>

Revision: 202
Author: zond
Date: 2007-02-08 08:33:55 -0500 (Thu, 08 Feb 2007)

Log Message:
-----------
fixed working tests for sanitation.
added messages to asserts.
made all clients use disco intialization from disco.rb.
changed around how publishables get db and service_id and service_description.
made some stop! methods more clever.
started on the maintenance of dumps.

Modified Paths:
--------------
    trunk/archipelago/lib/archipelago/client.rb
    trunk/archipelago/lib/archipelago/disco.rb
    trunk/archipelago/lib/archipelago/dump.rb
    trunk/archipelago/lib/archipelago/hashish.rb
    trunk/archipelago/lib/archipelago/sanitation.rb
    trunk/archipelago/lib/archipelago/tranny.rb
    trunk/archipelago/tests/disco_test.rb
    trunk/archipelago/tests/dump_test.rb
    trunk/archipelago/tests/sanitation_test.rb
    trunk/archipelago/tests/test_helper.rb

Modified: trunk/archipelago/lib/archipelago/client.rb
===================================================================
--- trunk/archipelago/lib/archipelago/client.rb 2007-02-07 18:16:42 UTC (rev 201)
+++ trunk/archipelago/lib/archipelago/client.rb 2007-02-08 13:33:55 UTC (rev 202)
@@ -25,6 +25,7 @@
  MAXIMUM_SERVICE_UPDATE_INTERVAL = 60

  class Base
+   include Archipelago::Disco::Camel
    attr_reader :jockey
    #
    # Initialize an instance using Archipelago::Disco::MC or :jockey if given,
@@ -40,13 +41,8 @@
  # Sets up this instance with the given +options+.
  #
  def setup(options = {})
-   @jockey.stop! if defined?(@jockey) && @jockey != Archipelago::Disco::MC
-   if defined?(Archipelago::Disco::MC)
-     @jockey = options[:jockey] || Archipelago::Disco::MC
-   else
-     @jockey = options[:jockey] || Archipelago::Disco::Jockey.new
-   end
-
+   setup_jockey(options)
+
    @initial_service_update_interval = options[:initial_service_update_interval] || INITIAL_SERVICE_UPDATE_INTERVAL
    @maximum_service_update_interval = options[:maximum_service_update_interval] || MAXIMUM_SERVICE_UPDATE_INTERVAL
  end

Modified: trunk/archipelago/lib/archipelago/disco.rb
===================================================================
--- trunk/archipelago/lib/archipelago/disco.rb 2007-02-07 18:16:42 UTC (rev 201)
+++ trunk/archipelago/lib/archipelago/disco.rb 2007-02-08 13:33:55 UTC (rev 202)
@@ -72,9 +72,47 @@
  # The host we are running on.
  #
  HOST = "#{Socket::gethostbyname(Socket::gethostname)[0]}" rescue "localhost"
+
+ #
+ # Anything that has a @jockey that is an Archipelago::Disco::Jockey
+ # can include this for simplicity.
+ #
+ module Camel
+   #
+   # Setup our @jockey as a Archipelago::Disco::Jockey with given options.
+   #
+   # It will first stop any @jockey we currently have that is NOT the global Archipelago::Disco::MC.
+   #
+   # If +jockey_options+ or +jockey+ are given it will always use a new or given Archipelago::Disco::Jockey,
+   # otherwise it will try to use the global Archipelago::Disco::Jockey instead.
+   #
+   def setup_jockey(options = {})
+     @jockey.stop! if defined?(@jockey) && @jockey != Archipelago::Disco::MC
+
+     @jockey_options ||= {}
+     jockey_options = (@jockey_options || {}).merge(options[:jockey_options] || {})
+
+     if options[:jockey]
+       @jockey = options[:jockey]
+       unless jockey_options.empty?
+         @jockey.setup(jockey_options)
+       end
+     else
+       unless jockey_options.empty?
+         @jockey = Archipelago::Disco::Jockey.new(jockey_options)
+       else
+         if defined?(Archipelago::Disco::MC)
+           @jockey = Archipelago::Disco::MC
+         else
+           @jockey = Archipelago::Disco::Jockey.new
+         end
+       end
+     end
+   end
+ end

  #
- # A module to simplify publishing services.
+ # A module to simplify publishing of services.
  #
  # If you include it you can use the publish! method
  # at your convenience.
@@ -89,7 +127,6 @@
  # define @persistence_provider before you call initialize_publishable.
  #
  module Publishable
-
    #
    # Also add the ClassMethods to +base+.
    #
@@ -120,12 +157,31 @@
  # :jockey_options.
  #
  def initialize_publishable(options = {})
+   #
+   # The provider of happy magic persistent hashes of different kinds.
+   #
+   @persistence_provider ||= options[:persistence_provider] || Archipelago::Hashish::BerkeleyHashishProvider.new(Pathname.new(File.expand_path(__FILE__)).parent.join(self.class.name + ".db"))
+   #
+   # Stuff that didnt fit in any of the other databases.
+   #
+   @metadata ||= @persistence_provider.get_hashish("metadata")
+   #
+   # Our service_description that is supposed to define and describe
+   # us in the discovery network.
+   #
    @service_description = {
-     :service_id => service_id,
+     :service_id => Digest::SHA1.hexdigest("#{HOST}:#{Time.new.to_f}:#{self.object_id}:#{rand(1 << 32)}").to_s,
      :validator => self,
      :service => self,
      :class => self.class.name
    }.merge(options[:service_description] || {})
+   #
+   # Our service_id that is supposed to be unique and persistent.
+   #
+   @metadata["service_id"] = @service_description[:service_id]
+   #
+   # Setup our Archipelago::Disco::Jockey.
+   #
    @jockey_options = options[:jockey_options] || {}
  end

@@ -163,31 +219,35 @@
  end

  #
+ # Override this if you want to do something magical before or after you
+ # get stopped.
+ #
+ def around_stop(&block)
+   yield
+ end
+
+ #
  # Stops the publishing of this Publishable.
  #
  def stop!
-   if valid?
-     @valid = false
-     if defined?(Archipelago::Disco::MC) && @jockey == Archipelago::Disco::MC
-       @jockey.unpublish(self.service_id)
-     else
-       @jockey.stop!
+   around_stop do
+     if defined?(@jockey)
+       if valid?
+         @valid = false
+         if defined?(Archipelago::Disco::MC) && @jockey == Archipelago::Disco::MC
+           @jockey.unpublish(self.service_id)
+         else
+           @jockey.stop!
+         end
+       end
      end
    end
  end
-
+
  #
  # Returns our semi-unique id so that we can be found again.
  #
  def service_id
-   #
-   # The provider of happy magic persistent hashes of different kinds.
-   #
-   @persistence_provider ||= Archipelago::Hashish::BerkeleyHashishProvider.new(Pathname.new(File.expand_path(__FILE__)).parent.join(self.class.name + ".db"))
-   #
-   # Stuff that didnt fit in any of the other databases.
-   #
-   @metadata ||= @persistence_provider.get_hashish("metadata")
    return @metadata["service_id"] ||= Digest::SHA1.hexdigest("#{HOST}:#{Time.new.to_f}:#{self.object_id}:#{rand(1 << 32)}").to_s
  end

@@ -291,7 +351,7 @@
  attr_reader :hash
  include Archipelago::Current::Synchronized
  include Archipelago::Current::ThreadedCollection
- def_delegators :@hash, :[], :each, :empty?, :delete, :values, :keys, :include?
+ def_delegators :@hash, :[], :size, :each, :empty?, :delete, :values, :keys, :include?
  def initialize(options = {})
    super
    @hash = options[:hash] || {}
@@ -309,7 +369,7 @@
  #
  def delete(key)
    value = @hash[key]
-   @jockey.instance_eval do notify_subscribers(:lost, value) end
+   @jockey.instance_eval do notify_subscribers(:lost, value) end if value
    @hash.delete(key)
  end
  #
@@ -386,14 +446,23 @@
    @new_service_semaphore = MonitorMixin::ConditionVariable.new(Archipelago::Current::Lock.new)
    @service_change_subscribers_by_event_type = {:found => {}, :lost => {}}
+
+   @validation_interval = options[:validation_interval] || VALIDATION_INTERVAL

    setup(options)
+
+   start!
+ end

+ #
+ # Start all our threads.
+ #
+ def start!(options = {})
    start_listener
    start_unilistener
    start_shouter
    start_picker
-   start_validator(options[:validation_interval] || VALIDATION_INTERVAL)
+   start_validator(options[:validation_interval] || @validation_interval)
  end

  #
@@ -473,7 +542,7 @@
  end

  #
- # Stops all the threads in this instance.
+ # Stops all the threads and close all sockets in this instance.
  #
  def stop!
    if @valid
@@ -481,14 +550,24 @@
      @local_services.each do |service_id, service_description|
        self.unpublish(service_id)
      end
+
      @listener_thread.kill
      @unilistener_thread.kill
-     @validator_thread.kill
+     until @incoming.empty?
+       sleep(0.01)
+     end
+     @listener.close
+     @unilistener.close
      @picker_thread.kill
+
      until @outgoing.empty?
        sleep(0.01)
      end
      @shouter_thread.kill
+     @sender.close
+     @unisender.close
+
+     @validator_thread.kill
    end
  end

Modified: trunk/archipelago/lib/archipelago/dump.rb
===================================================================
--- trunk/archipelago/lib/archipelago/dump.rb 2007-02-07 18:16:42 UTC (rev 201)
+++ trunk/archipelago/lib/archipelago/dump.rb 2007-02-08 13:33:55 UTC (rev 202)
@@ -43,14 +43,9 @@
  #
  # The provider of checksumming magic and chunk distribution.
  #
- @officer = options[:officer] || Archipelago::Sanitation::CLEANER
+ @officer = options[:officer] || (defined?(Archipelago::Sanitation::CLEANER) || Archipelago::Sanitation::Officer.new)

  #
- # The provider of discovery and service change subscriptions.
- #
- @captain = options[:captain] || Archipelago::Pirate::BLACKBEARD
-
- #
  # The database where the data lives.
  #
  @db = @persistence_provider.get_dup_tree("db")
@@ -64,9 +59,20 @@

  def around_publish(&publish_block)
    yield
-   initialize_subscriptions
+   @officer.subscribe(:found, @officer.site_description) do |record|
+     found_peer(record)
+   end
+   @officer.subscribe(:lost, @officer.site_description) do |record|
+     lost_peer(record)
+   end
  end

+ def around_stop(&block)
+   @officer.unsubscribe(:found, @officer.site_description)
+   @officer.unsubscribe(:lost, @officer.site_description)
+   yield
+ end
+
  def insert!(key, values)
    @db.env.begin(BDB::TXN_COMMIT, @db) do |txn, db|
      db.delete(key)
@@ -86,30 +92,24 @@

  private

- #
- # Start subscribing to other Archipelago::Dump::Site instances
- # appearing or disappearing.
- #
- # * A service starting up:
- #   * A service starting up right before us will get ownership of all our keys:
- #     * All keys belonging to us in our and our relevant successor list must
- #       change owner to the new service.
- #     * The last relevant successor having a chunk of that key must drop it.
- #   * A service starting up within our relevant successor list will get a chunk
- #     from all our keys, and the last one in the relevant successor list will
- #     have to drop it.
- # * A service disappearing:
- #   * A service disappearing right before us will make us the new master for all
- #     its keys:
- #     * All keys belonging to them in our and our relevant successor list must
- #       change owner to us.
- #     * A new chunk must be sent to the last relevant successor.
- #   * A service disappering from our relevant successor list will make us send
- #     a new chunk to the last service in our new relevant successor list.
- #
- def initialize_subscriptions
+ def master_to?(other_id)
+   @officer.master_to?(service_id, other_id)
  end

+ def slave_to?(other_id)
+   @officer.master_to?(other_id, service_id)
+ end
+
+ def found_peer(record)
+   if master_to?(record[:service_id])
+   end
+   if slave_to?(record[:service_id])
+   end
+ end
+
+ def lost_peer(record)
+ end
+
  #
  # Ensures that all the dumps responsible for +key+
  # has chunks for that key.

Modified: trunk/archipelago/lib/archipelago/hashish.rb
===================================================================
--- trunk/archipelago/lib/archipelago/hashish.rb 2007-02-07 18:16:42 UTC (rev 201)
+++ trunk/archipelago/lib/archipelago/hashish.rb 2007-02-08 13:33:55 UTC (rev 202)
@@ -310,7 +310,7 @@
    close!
    home = Pathname.new(@env.home).expand_path
    @env.close
-   home.rmtree if home.exist?
+   home.rmtree if home.exist?
  end

end

Modified: trunk/archipelago/lib/archipelago/sanitation.rb
===================================================================
--- trunk/archipelago/lib/archipelago/sanitation.rb 2007-02-07 18:16:42 UTC (rev 201)
+++ trunk/archipelago/lib/archipelago/sanitation.rb 2007-02-08 13:33:55 UTC (rev 202)
@@ -16,18 +16,38 @@
 # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.

 require 'archipelago/client'
+require 'forwardable'

 module Archipelago

   module Sanitation

+    #
+    # Description of the Dump::Sites that can
+    # store our data.
+    #
     SITE_DESCRIPTION = {
       :class => 'Archipelago::Dump::Site'
     }
+    #
+    # The minimum size of data that is reasonable to
+    # recover without too much overhead.
+    #
     MINIMUM_RECOVERABLE_SIZE = 128
+    #
+    # The minimum number of chunks we want to spread out
+    # to ensure us against server failure.
+    #
     MINIMUM_NR_OF_CHUNKS = 14
+    #
+    # The minimum ratio of redundancy (used diskspace / data size)
+    # we want to use.
+    #
     MINIMUM_REDUNDANCY_RATIO = 2
+    #
+    # The extra bytes used by metadata in all check block chunks.
+    #
     METADATA_OVERHEAD = 8

     #
@@ -51,7 +71,9 @@
  end

  class Officer < Archipelago::Client::Base
-   attr_reader :sites
+   extend Forwardable
+   attr_reader :sites, :site_description
+   def_delegators :@jockey, :subscribe, :unsubscribe
    def initialize(options = {})
      super(options)

@@ -64,7 +86,7 @@
    @minimum_nr_of_chunks = options[:minimum_nr_of_chunks] || MINIMUM_NR_OF_CHUNKS
    @minimum_redundancy_ratio = options[:minimum_redundancy_ratio] || MINIMUM_REDUNDANCY_RATIO
    @metadata_overhead = options[:metadata_overhead] || METADATA_OVERHEAD
-   @site_description = SITE_DESCRIPTION.merge(options[:site_description] || {})
+   @site_description = Archipelago::Disco::Query.new(SITE_DESCRIPTION.merge(options[:site_description] || {}))
  end

  def []=(key, value)
@@ -123,7 +145,7 @@
  end

  def update_services!
-   @sites = @jockey.lookup(Archipelago::Disco::Query.new(@site_description), 0)
+   @sites = @jockey.lookup(site_description, 0)
  end

  #
@@ -144,6 +166,13 @@
    return rval
  end

+ #
+ # Returns whether +potential_master_id+ is supposed to be a master to +potential_slave_id+.
+ #
+ def master_to?(potential_master_id, potential_slave_id)
+   get_least_greater_than(@sites, potential_master_id, @minimum_nr_of_chunks - 1).include?(potential_slave_id)
+ end
+
  private

  #

Modified: trunk/archipelago/lib/archipelago/tranny.rb
===================================================================
--- trunk/archipelago/lib/archipelago/tranny.rb 2007-02-07 18:16:42 UTC (rev 201)
+++ trunk/archipelago/lib/archipelago/tranny.rb 2007-02-08 13:33:55 UTC (rev 202)
@@ -107,8 +107,6 @@
  # Will use Archipelago::Disco::Publishable by calling initialize_publishable with +options+.
  #
  def initialize(options = {})
-   @persistence_provider = options[:persistence_provider] || Archipelago::Hashish::BerkeleyHashishProvider.new(Pathname.new(File.expand_path(__FILE__)).parent.join("tranny_manager.db"))
-
    initialize_publishable(options)

    @transaction_timeout = options[:transaction_timeout] || TRANSACTION_TIMEOUT

Modified: trunk/archipelago/tests/disco_test.rb
===================================================================
--- trunk/archipelago/tests/disco_test.rb 2007-02-07 18:16:42 UTC (rev 201)
+++ trunk/archipelago/tests/disco_test.rb 2007-02-08 13:33:55 UTC (rev 202)
@@ -140,7 +140,7 @@
      :validator => Archipelago::Disco::MockValidator.new,
      :service_id => 33))
  sleep 0.1
- assert(@ltq.empty?)
+ assert(@ltq.empty?, "we got messages while we shouldnt: #{@ltq.inspect}")
  c1 = Archipelago::Disco::Jockey.new(:thrifty_publishing => true)
  assert(!c1.lookup(Archipelago::Disco::Query.new(:glad => "ja")).empty?)

Modified: trunk/archipelago/tests/dump_test.rb
===================================================================
--- trunk/archipelago/tests/dump_test.rb 2007-02-07 18:16:42 UTC (rev 201)
+++ trunk/archipelago/tests/dump_test.rb 2007-02-08 13:33:55 UTC (rev 202)
@@ -1,7 +1,4 @@

-MC_ENABLED = true
-BLACKBEARD_ENABLED = true
-
 require File.join(File.dirname(__FILE__), 'test_helper')

 class DumpTest < Test::Unit::TestCase

Modified: trunk/archipelago/tests/sanitation_test.rb
===================================================================
--- trunk/archipelago/tests/sanitation_test.rb 2007-02-07 18:16:42 UTC (rev 201)
+++ trunk/archipelago/tests/sanitation_test.rb 2007-02-08 13:33:55 UTC (rev 202)
@@ -1,18 +1,20 @@

-MC_ENABLED = true
-BLACKBEARD_ENABLED = true
-
 require File.join(File.dirname(__FILE__), 'test_helper')

 class SanitationTest < Test::Unit::TestCase

+  def test_truth
+    assert(true)
+  end
+
   def setup
     DRb.start_service
     @d = Archipelago::Dump::Site.new(:persistence_provider => Archipelago::Hashish::BerkeleyHashishProvider.new(Pathname.new(__FILE__).parent.join("site.db")))
     @d.publish!
+    @c = Archipelago::Sanitation::Officer.new
     assert_within(10) do
-      Archipelago::Sanitation::CLEANER.update_services!
-      s = Archipelago::Sanitation::CLEANER.instance_eval do @sites end
+      @c.update_services!
+      s = @c.instance_eval do @sites end
       s.keys == [@d.service_id]
     end
   end
@@ -23,11 +25,43 @@
     DRb.stop_service
   end

+  def test_master_to
+    @d.stop!
+    dumps = []
+    cleaner2 = Archipelago::Sanitation::Officer.new(:minimum_nr_of_chunks => 3)
+    begin
+      10.times do |n|
+        dumps[n] = Archipelago::Dump::Site.new(:service_description => {:service_id => n.to_s},
+                                               :persistence_provider => Archipelago::Hashish::BerkeleyHashishProvider.new(Pathname.new(__FILE__).parent.join("master_test_#{n}")))
+        dumps[n].publish!
+      end
+      assert_within(10) do
+        cleaner2.update_services!
+        s = cleaner2.instance_eval do @sites end
+        dumps.all? do |dump|
+          s.include?(dump.service_id)
+        end
+      end
+      10.times do |pot_mast|
+        10.times do |pot_slave|
+          if (d = pot_slave - pot_mast) > 0 && d < 3
+            assert(cleaner2.master_to?(dumps[pot_mast].service_id, dumps[pot_slave].service_id), "master_to?(#{dumps[pot_mast].service_id}, #{dumps[pot_slave].service_id}) should be true but is not. known dumps are #{cleaner2.sites.keys.inspect}")
+          end
+        end
+      end
+    ensure
+      dumps.each do |dump|
+        dump.stop!
+        dump.instance_eval do @persistence_provider.unlink! end
+      end
+    end
+  end
+
   def test_get_set
-    Archipelago::Sanitation::CLEANER["hej"] = "hoho"
-    assert_equal(Archipelago::Sanitation::CLEANER["hej"], "hoho")
-    Archipelago::Sanitation::CLEANER.delete!("hej")
-    assert_equal(Archipelago::Sanitation::CLEANER["hej"], nil)
+    @c["hej"] = "hoho"
+    assert_equal(@c["hej"], "hoho")
+    @c.delete!("hej")
+    assert_equal(@c["hej"], nil)
   end

 end

Modified: trunk/archipelago/tests/test_helper.rb
===================================================================
--- trunk/archipelago/tests/test_helper.rb 2007-02-07 18:16:42 UTC (rev 201)
+++ trunk/archipelago/tests/test_helper.rb 2007-02-08 13:33:55 UTC (rev 202)
@@ -4,6 +4,7 @@

 MC_DISABLED = true unless defined?(MC_ENABLED) && MC_ENABLED
 BLACKBEARD_DISABLED = true unless defined?(BLACKBEARD_ENABLED) && BLACKBEARD_ENABLED
+CLEANER_DISABLED = true unless defined?(CLEANER_ENABLED) && CLEANER_ENABLED

 require 'pp'
 require 'drb'

From nobody at rubyforge.org Thu Feb 8 08:35:48 2007
From: nobody at rubyforge.org (nobody at rubyforge.org)
Date: Thu, 8 Feb 2007 08:35:48 -0500 (EST)
Subject: [Archipelago-submits] [203] trunk/archipelago/lib/archipelago/treasure.rb: made treasure chest use persistence provider from disco.rb as well
Message-ID: <20070208133548.ACD4B5242529@rubyforge.org>

Revision: 203
Author: zond
Date: 2007-02-08 08:35:48 -0500 (Thu, 08 Feb 2007)

Log Message:
-----------
made treasure chest use persistence provider from disco.rb as well

Modified Paths:
--------------
    trunk/archipelago/lib/archipelago/treasure.rb

Modified: trunk/archipelago/lib/archipelago/treasure.rb
===================================================================
--- trunk/archipelago/lib/archipelago/treasure.rb 2007-02-08 13:33:55 UTC (rev 202)
+++ trunk/archipelago/lib/archipelago/treasure.rb 2007-02-08 13:35:48 UTC (rev 203)
@@ -254,11 +254,6 @@
  #
  def initialize(options = {})
    #
-   # The provider of happy magic persistent hashes of different kinds.
- # - @persistence_provider = options[:persistence_provider] || Archipelago::Hashish::BerkeleyHashishProvider.new(Pathname.new(File.expand_path(__FILE__)).parent.join("treasure_chest.db")) - - # # Use the given options to initialize the publishable # instance variables. # From nobody at rubyforge.org Thu Feb 8 08:52:14 2007 From: nobody at rubyforge.org (nobody at rubyforge.org) Date: Thu, 8 Feb 2007 08:52:14 -0500 (EST) Subject: [Archipelago-submits] [204] trunk/archipelago/lib/archipelago/disco.rb: small fix in disco.rb Message-ID: <20070208135214.7DB575242365@rubyforge.org> Revision: 204 Author: zond Date: 2007-02-08 08:52:13 -0500 (Thu, 08 Feb 2007) Log Message: ----------- small fix in disco.rb Modified Paths: -------------- trunk/archipelago/lib/archipelago/disco.rb Modified: trunk/archipelago/lib/archipelago/disco.rb =================================================================== --- trunk/archipelago/lib/archipelago/disco.rb 2007-02-08 13:35:48 UTC (rev 203) +++ trunk/archipelago/lib/archipelago/disco.rb 2007-02-08 13:52:13 UTC (rev 204) @@ -87,7 +87,7 @@ # otherwise it will try to use the global Archipelago::Disco::Jockey instead. # def setup_jockey(options = {}) - @jockey.stop! if defined?(@jockey) && @jockey != Archipelago::Disco::MC + @jockey.stop! 
if defined?(@jockey) && (!defined?(Archipelago::Disco::MC) || @jockey != Archipelago::Disco::MC) @jockey_options ||= {} jockey_options = (@jockey_options || {}).merge(options[:jockey_options] || {}) From nobody at rubyforge.org Thu Feb 8 09:00:34 2007 From: nobody at rubyforge.org (nobody at rubyforge.org) Date: Thu, 8 Feb 2007 09:00:34 -0500 (EST) Subject: [Archipelago-submits] [205] trunk/archipelago/lib/archipelago/disco.rb: made publishables use disco camel as well Message-ID: <20070208140035.02ECA5242526@rubyforge.org> Revision: 205 Author: zond Date: 2007-02-08 09:00:34 -0500 (Thu, 08 Feb 2007) Log Message: ----------- made publishables use disco camel as well Modified Paths: -------------- trunk/archipelago/lib/archipelago/disco.rb Modified: trunk/archipelago/lib/archipelago/disco.rb =================================================================== --- trunk/archipelago/lib/archipelago/disco.rb 2007-02-08 13:52:13 UTC (rev 204) +++ trunk/archipelago/lib/archipelago/disco.rb 2007-02-08 14:00:34 UTC (rev 205) @@ -127,6 +127,7 @@ # define @persistence_provider before you call initialize_publishable. # module Publishable + include Camel # # Also add the ClassMethods to +base+. # @@ -202,7 +203,7 @@ # def publish!(options = {}) around_publish do - @jockey ||= defined?(Archipelago::Disco::MC) ? 
Archipelago::Disco::MC : Archipelago::Disco::Jockey.new(@jockey_options.merge(options[:jockey_options] || {})) + setup_jockey(options) @jockey.publish(Archipelago::Disco::Record.new(@service_description.merge(options[:service_description] || {}))) end end From nobody at rubyforge.org Thu Feb 8 09:35:16 2007 From: nobody at rubyforge.org (nobody at rubyforge.org) Date: Thu, 8 Feb 2007 09:35:16 -0500 (EST) Subject: [Archipelago-submits] [206] trunk/archipelago/lib/archipelago/sanitation.rb: added comments to sanitation.rb Message-ID: <20070208143516.532ED5242526@rubyforge.org> Revision: 206 Author: zond Date: 2007-02-08 09:35:15 -0500 (Thu, 08 Feb 2007) Log Message: ----------- added comments to sanitation.rb Modified Paths: -------------- trunk/archipelago/lib/archipelago/sanitation.rb Modified: trunk/archipelago/lib/archipelago/sanitation.rb =================================================================== --- trunk/archipelago/lib/archipelago/sanitation.rb 2007-02-08 14:00:34 UTC (rev 205) +++ trunk/archipelago/lib/archipelago/sanitation.rb 2007-02-08 14:35:15 UTC (rev 206) @@ -70,6 +70,13 @@ end end + # + # The client class for the redundant Archipelago::Dump network. + # + # Keeps track of our sites and writes and reads data. + # + # Also keeps track of all the redundancy work needed. + # class Officer < Archipelago::Client::Base extend Forwardable attr_reader :sites, :site_description @@ -89,6 +96,9 @@ @site_description = Archipelago::Disco::Query.new(SITE_DESCRIPTION.merge(options[:site_description] || {})) end + # + # Write +key+ and +value+ into the site network with a good level of redundancy etc. + # def []=(key, value) t = [Time.now.to_i].pack("I") super_string = Oneliner::SuperString.new(value) @@ -112,6 +122,9 @@ end end + # + # Get the data for +key+ in the site network. 
+ # def [](key) dump_hash = responsible_sites(key, @minimum_nr_of_chunks) dump_ids = dump_hash.keys @@ -144,6 +157,9 @@ end + # + # Updates our cache of the available sites using our jockey. + # def update_services! @sites = @jockey.lookup(site_description, 0) end From nobody at rubyforge.org Thu Feb 8 11:56:31 2007 From: nobody at rubyforge.org (nobody at rubyforge.org) Date: Thu, 8 Feb 2007 11:56:31 -0500 (EST) Subject: [Archipelago-submits] [207] trunk/archipelago: added more sanitation tests and methods. Message-ID: <20070208165631.633695242422@rubyforge.org> Revision: 207 Author: zond Date: 2007-02-08 11:56:30 -0500 (Thu, 08 Feb 2007) Log Message: ----------- added more sanitation tests and methods. improved disco service_id setup for publishables. Modified Paths: -------------- trunk/archipelago/doc/dump maintenance.dia trunk/archipelago/lib/archipelago/disco.rb trunk/archipelago/lib/archipelago/dump.rb trunk/archipelago/lib/archipelago/sanitation.rb trunk/archipelago/tests/sanitation_test.rb Modified: trunk/archipelago/doc/dump maintenance.dia =================================================================== (Binary files differ) Modified: trunk/archipelago/lib/archipelago/disco.rb =================================================================== --- trunk/archipelago/lib/archipelago/disco.rb 2007-02-08 14:35:15 UTC (rev 206) +++ trunk/archipelago/lib/archipelago/disco.rb 2007-02-08 16:56:30 UTC (rev 207) @@ -171,7 +171,7 @@ # us in the discovery network. # @service_description = { - :service_id => Digest::SHA1.hexdigest("#{HOST}:#{Time.new.to_f}:#{self.object_id}:#{rand(1 << 32)}").to_s, + :service_id => service_id || Digest::SHA1.hexdigest("#{HOST}:#{Time.new.to_f}:#{self.object_id}:#{rand(1 << 32)}").to_s, :validator => self, :service => self, :class => self.class.name @@ -249,7 +249,7 @@ # Returns our semi-unique id so that we can be found again. 
# def service_id - return @metadata["service_id"] ||= Digest::SHA1.hexdigest("#{HOST}:#{Time.new.to_f}:#{self.object_id}:#{rand(1 << 32)}").to_s + return @metadata["service_id"] end end Modified: trunk/archipelago/lib/archipelago/dump.rb =================================================================== --- trunk/archipelago/lib/archipelago/dump.rb 2007-02-08 14:35:15 UTC (rev 206) +++ trunk/archipelago/lib/archipelago/dump.rb 2007-02-08 16:56:30 UTC (rev 207) @@ -27,6 +27,9 @@ module Dump + # + # The server class in the Archipipelago::Dump network. + # class Site # @@ -90,8 +93,6 @@ return @db.delete(key) end - private - def master_to?(other_id) @officer.master_to?(service_id, other_id) end @@ -100,8 +101,21 @@ @officer.master_to?(other_id, service_id) end + def last_slave + @officer.last_slave_to(service_id) + end + + def next + @officer.next_to(service_id) + end + + private + def found_peer(record) - if master_to?(record[:service_id]) + Thread.new do + if master_to?(record[:service_id]) + + end end if slave_to?(record[:service_id]) end Modified: trunk/archipelago/lib/archipelago/sanitation.rb =================================================================== --- trunk/archipelago/lib/archipelago/sanitation.rb 2007-02-08 14:35:15 UTC (rev 206) +++ trunk/archipelago/lib/archipelago/sanitation.rb 2007-02-08 16:56:30 UTC (rev 207) @@ -189,6 +189,20 @@ get_least_greater_than(@sites, potential_master_id, @minimum_nr_of_chunks - 1).include?(potential_slave_id) end + # + # Returns the last slave to +master_id+. + # + def last_slave_to(master_id) + @sites[get_least_greater_than(@sites, master_id, @minimum_nr_of_chunks - 1).last][:service] + end + + # + # Returns the next site after +dump_id+. 
+ # + def next_to(dump_id) + @sites[get_least_greater_than(@sites, dump_id, 1).last][:service] + end + private # Modified: trunk/archipelago/tests/sanitation_test.rb =================================================================== --- trunk/archipelago/tests/sanitation_test.rb 2007-02-08 14:35:15 UTC (rev 206) +++ trunk/archipelago/tests/sanitation_test.rb 2007-02-08 16:56:30 UTC (rev 207) @@ -25,14 +25,15 @@ DRb.stop_service end - def test_master_to + def test_navigation @d.stop! dumps = [] cleaner2 = Archipelago::Sanitation::Officer.new(:minimum_nr_of_chunks => 3) begin 10.times do |n| - dumps[n] = Archipelago::Dump::Site.new(:service_description => {:service_id => n.to_s}, - :persistence_provider => Archipelago::Hashish::BerkeleyHashishProvider.new(Pathname.new(__FILE__).parent.join("master_test_#{n}"))) + dumps[n] = Archipelago::Dump::Site.new(:officer => cleaner2, + :service_description => {:service_id => n.to_s}, + :persistence_provider => Archipelago::Hashish::BerkeleyHashishProvider.new(Pathname.new(__FILE__).parent.join("master_test_#{n}.db"))) dumps[n].publish! end assert_within(10) do @@ -45,10 +46,27 @@ 10.times do |pot_mast| 10.times do |pot_slave| if (d = pot_slave - pot_mast) > 0 && d < 3 - assert(cleaner2.master_to?(dumps[pot_mast].service_id, dumps[pot_slave].service_id), "master_to?(#{dumps[pot_mast].service_id}, #{dumps[pot_slave].service_id}) should be true but is not. known dumps are #{cleaner2.sites.keys.inspect}") + assert(dumps[pot_mast].master_to?(dumps[pot_slave].service_id), + "#{dumps[pot_mast].service_id}.master_to?(#{dumps[pot_slave].service_id}) should be true but is not. known dumps are #{cleaner2.sites.keys.inspect}") + assert(dumps[pot_slave].slave_to?(dumps[pot_mast].service_id), + "#{dumps[pot_slave].service_id}.slave_to?(#{dumps[pot_mast].service_id}) should be true but is not. 
known dumps are #{cleaner2.sites.keys.inspect}") end + if (d = (dumps.size + pot_slave) - pot_mast) > 0 && d < 3 + assert(dumps[pot_mast].master_to?(dumps[pot_slave].service_id), + "#{dumps[pot_mast].service_id}.master_to?(#{dumps[pot_slave].service_id}) should be true but is not. known dumps are #{cleaner2.sites.keys.inspect}") + assert(dumps[pot_slave].slave_to?(dumps[pot_mast].service_id), + "#{dumps[pot_slave].service_id}.slave_to?(#{dumps[pot_mast].service_id}) should be true but is not. known dumps are #{cleaner2.sites.keys.inspect}") + end end end + 9.times do |site_id| + assert_equal(dumps[site_id + 1].service_id, dumps[site_id].next.service_id) + end + assert_equal(dumps.first.service_id, dumps.last.next.service_id) + dumps.each do |dump| + assert_equal(dump.next.next.service_id, + dump.last_slave.service_id) + end ensure dumps.each do |dump| dump.stop! From nobody at rubyforge.org Fri Feb 9 05:43:04 2007 From: nobody at rubyforge.org (nobody at rubyforge.org) Date: Fri, 9 Feb 2007 05:43:04 -0500 (EST) Subject: [Archipelago-submits] [208] trunk/archipelago: made subscriptions more dynamic by allowing more subscriptions with different ids Message-ID: <20070209104304.330325240A98@rubyforge.org> Revision: 208 Author: zond Date: 2007-02-09 05:43:03 -0500 (Fri, 09 Feb 2007) Log Message: ----------- made subscriptions more dynamic by allowing more subscriptions with different ids Modified Paths: -------------- trunk/archipelago/lib/archipelago/disco.rb trunk/archipelago/lib/archipelago/dump.rb trunk/archipelago/tests/disco_test.rb Modified: trunk/archipelago/lib/archipelago/disco.rb =================================================================== --- trunk/archipelago/lib/archipelago/disco.rb 2007-02-08 16:56:30 UTC (rev 207) +++ trunk/archipelago/lib/archipelago/disco.rb 2007-02-09 10:43:03 UTC (rev 208) @@ -249,7 +249,7 @@ # Returns our semi-unique id so that we can be found again. 
# def service_id - return @metadata["service_id"] + return @service_id ||= @metadata["service_id"] end end @@ -472,15 +472,15 @@ # # Recognized +event_types+: :found, :lost # - def subscribe(event_type, match, &block) - @service_change_subscribers_by_event_type[event_type][match] = block + def subscribe(event_type, match, identity, &block) + @service_change_subscribers_by_event_type[event_type][[match, identity]] = block end # # Will stop listening for +event_type+ and +match+. # - def unsubscribe(event_type, match) - @service_change_subscribers_by_event_type[event_type].delete(match) + def unsubscribe(event_type, match, identity) + @service_change_subscribers_by_event_type[event_type].delete([match, identity]) end # @@ -636,7 +636,8 @@ # Will notify all subscribers to +event_type+ looking for +record+. # def notify_subscribers(event_type, record) - @service_change_subscribers_by_event_type[event_type].each do |query, proc| + @service_change_subscribers_by_event_type[event_type].each do |query_and_identity, proc| + query = query_and_identity.first proc.call(record) if record.matches?(query) end end Modified: trunk/archipelago/lib/archipelago/dump.rb =================================================================== --- trunk/archipelago/lib/archipelago/dump.rb 2007-02-08 16:56:30 UTC (rev 207) +++ trunk/archipelago/lib/archipelago/dump.rb 2007-02-09 10:43:03 UTC (rev 208) @@ -62,17 +62,17 @@ def around_publish(&publish_block) yield - @officer.subscribe(:found, @officer.site_description) do |record| + @officer.subscribe(:found, @officer.site_description, service_id) do |record| found_peer(record) end - @officer.subscribe(:lost, @officer.site_description) do |record| + @officer.subscribe(:lost, @officer.site_description, service_id) do |record| lost_peer(record) end end def around_stop(&block) - @officer.unsubscribe(:found, @officer.site_description) - @officer.unsubscribe(:lost, @officer.site_description) + @officer.unsubscribe(:found, @officer.site_description, 
service_id) + @officer.unsubscribe(:lost, @officer.site_description, service_id) yield end @@ -109,8 +109,6 @@ @officer.next_to(service_id) end - private - def found_peer(record) Thread.new do if master_to?(record[:service_id]) Modified: trunk/archipelago/tests/disco_test.rb =================================================================== --- trunk/archipelago/tests/disco_test.rb 2007-02-08 16:56:30 UTC (rev 207) +++ trunk/archipelago/tests/disco_test.rb 2007-02-09 10:43:03 UTC (rev 208) @@ -69,10 +69,10 @@ found_it = false found_wrong = false - @d2.subscribe(:found, Archipelago::Disco::Query.new(:epa => "blar2")) do + @d2.subscribe(:found, Archipelago::Disco::Query.new(:epa => "blar2"), 1) do found_it = true end - @d2.subscribe(:found, Archipelago::Disco::Query.new(:epa => "blar2x")) do + @d2.subscribe(:found, Archipelago::Disco::Query.new(:epa => "blar2x"), 1) do found_wrong = true end @@ -100,13 +100,17 @@ lost_it = false lost_wrong = false - - @d2.subscribe(:lost, Archipelago::Disco::Query.new(:epa => "blar")) do + + @d2.subscribe(:lost, Archipelago::Disco::Query.new(:epa => "blar"), 1) do lost_it = true end - @d2.subscribe(:lost, Archipelago::Disco::Query.new(:epa => "blarx")) do + @d2.subscribe(:lost, Archipelago::Disco::Query.new(:epa => "blarx"), 1) do lost_wrong = true end + @d2.subscribe(:lost, Archipelago::Disco::Query.new(:epa => "blarx"), 2) do + lost_wrong = true + end + @d2.unsubscribe(:lost, Archipelago::Disco::Query.new(:epa => "blarx"), 2) assert(@d2.lookup(Archipelago::Disco::Query.new(:epa => "blar"), 0).empty?) 
From nobody at rubyforge.org Fri Feb 9 06:06:40 2007 From: nobody at rubyforge.org (nobody at rubyforge.org) Date: Fri, 9 Feb 2007 06:06:40 -0500 (EST) Subject: [Archipelago-submits] [209] trunk/archipelago: made sanitation do stuff to remote sites in parallell Message-ID: <20070209110640.5C9FA5240A04@rubyforge.org> Revision: 209 Author: zond Date: 2007-02-09 06:06:40 -0500 (Fri, 09 Feb 2007) Log Message: ----------- made sanitation do stuff to remote sites in parallell Modified Paths: -------------- trunk/archipelago/lib/archipelago/dump.rb trunk/archipelago/lib/archipelago/sanitation.rb trunk/archipelago/tests/sanitation_test.rb Modified: trunk/archipelago/lib/archipelago/dump.rb =================================================================== --- trunk/archipelago/lib/archipelago/dump.rb 2007-02-09 10:43:03 UTC (rev 208) +++ trunk/archipelago/lib/archipelago/dump.rb 2007-02-09 11:06:40 UTC (rev 209) @@ -30,6 +30,9 @@ # # The server class in the Archipipelago::Dump network. # + # Uses an Archipelago::Sanitation::Officer to keep track of what is + # needed to be done for redundancy, but Site does the actual work. + # class Site # Modified: trunk/archipelago/lib/archipelago/sanitation.rb =================================================================== --- trunk/archipelago/lib/archipelago/sanitation.rb 2007-02-09 10:43:03 UTC (rev 208) +++ trunk/archipelago/lib/archipelago/sanitation.rb 2007-02-09 11:06:40 UTC (rev 209) @@ -17,6 +17,7 @@ require 'archipelago/client' require 'forwardable' +require 'monitor' module Archipelago @@ -75,7 +76,8 @@ # # Keeps track of our sites and writes and reads data. # - # Also keeps track of all the redundancy work needed. + # Also keeps track of all the redundancy work needed, but lets Site do the + # work. 
# class Officer < Archipelago::Client::Base extend Forwardable @@ -107,7 +109,7 @@ chunk_size = @minimum_recoverable_size / nr_of_needed_chunks if chunk_size < @minimum_recoverable_size / nr_of_needed_chunks dump_hash = responsible_sites(key, @minimum_nr_of_chunks) - dump_hash.each do |dump_id, nr_of_chunks_needed| + dump_hash.t_each do |dump_id, nr_of_chunks_needed| @sites[dump_id][:service].insert!(key, (0...nr_of_chunks_needed).collect do |nr_of_chunks_needed| t + super_string.encode(chunk_size) @@ -117,7 +119,7 @@ def delete!(key) dump_hash = responsible_sites(key, @minimum_nr_of_chunks) - dump_hash.each do |dump_id, nr_of_chunks_available| + dump_hash.t_each do |dump_id, nr_of_chunks_available| @sites[dump_id][:service].delete!(key) end end @@ -129,22 +131,27 @@ dump_hash = responsible_sites(key, @minimum_nr_of_chunks) dump_ids = dump_hash.keys newest_timestamp = "\000\000\000\000" + threads = [] + rval = Oneliner::SuperString.new + rval.extend(MonitorMixin) - rval = Oneliner::SuperString.new - while !rval.decode_done? && dump_ids.size > 0 - chunks = @sites[dump_ids.shift][:service].fetch(key) - while !rval.decode_done? && chunks.size > 0 - chunk = chunks.shift - t = chunk[0...4] - data = chunk[4..-1] - if t > newest_timestamp - rval = Oneliner::SuperString.new - newest_timestamp = t + dump_hash.t_each do |dump_id, nr_of_chunks_available| + site = @sites[dump_id][:service] + chunks = site.fetch(key) + rval.mon_synchronize do + while !rval.decode_done? && chunks.size > 0 + chunk = chunks.shift + t = chunk[0...4] + data = chunk[4..-1] + if t > newest_timestamp + rval = Oneliner::SuperString.new + newest_timestamp = t + end + + if t == newest_timestamp + rval.decode!(data) + end end - - if t == newest_timestamp - rval.decode!(data) - end end end @@ -175,6 +182,7 @@ raise NoRemoteDatabaseAvailableException.new(self) if @sites.empty? 
rval = {} + rval.extend(Archipelago::Current::ThreadedCollection) get_least_greater_than(@sites, key, n).each do |id| rval[id] ||= 0 rval[id] += 1 Modified: trunk/archipelago/tests/sanitation_test.rb =================================================================== --- trunk/archipelago/tests/sanitation_test.rb 2007-02-09 10:43:03 UTC (rev 208) +++ trunk/archipelago/tests/sanitation_test.rb 2007-02-09 11:06:40 UTC (rev 209) @@ -77,7 +77,7 @@ def test_get_set @c["hej"] = "hoho" - assert_equal(@c["hej"], "hoho") + assert_equal("hoho", @c["hej"]) @c.delete!("hej") assert_equal(@c["hej"], nil) end From nobody at rubyforge.org Fri Feb 9 12:53:28 2007 From: nobody at rubyforge.org (nobody at rubyforge.org) Date: Fri, 9 Feb 2007 12:53:28 -0500 (EST) Subject: [Archipelago-submits] [210] trunk/archipelago: made Dump#insert! check given timestamp to see if new chunks should be added. Message-ID: <20070209175328.69EB75240E3A@rubyforge.org> Revision: 210 Author: zond Date: 2007-02-09 12:53:27 -0500 (Fri, 09 Feb 2007) Log Message: ----------- made Dump#insert! check given timestamp to see if new chunks should be added. Made Sanitation#[]= add timestamps. Added Sanitation#redistribute that reuses timestamps. 
Modified Paths: -------------- trunk/archipelago/doc/dump maintenance.dia trunk/archipelago/lib/archipelago/dump.rb trunk/archipelago/lib/archipelago/sanitation.rb Modified: trunk/archipelago/doc/dump maintenance.dia =================================================================== (Binary files differ) Modified: trunk/archipelago/lib/archipelago/dump.rb =================================================================== --- trunk/archipelago/lib/archipelago/dump.rb 2007-02-09 11:06:40 UTC (rev 209) +++ trunk/archipelago/lib/archipelago/dump.rb 2007-02-09 17:53:27 UTC (rev 210) @@ -79,12 +79,25 @@ yield end - def insert!(key, values) - @db.env.begin(BDB::TXN_COMMIT, @db) do |txn, db| - db.delete(key) + def insert!(key, values, timestamp = "") + if (duplicates = @db.duplicates(key)).empty? values.each do |value| - db[key] = value + @db[key] = timestamp + value end + else + my_timestamp = duplicates.first[0...4] + if timestamp != my_timestamp || duplicates.size > values.size + @db.env.begin(BDB::TXN_COMMIT, @db) do |txn, db| + db.delete(key) + values.each do |value| + db[key] = timestamp + value + end + end + else + values[0...(values.size - duplicates.size)].each do |value| + @db[key] = timestamp + value + end + end end end @@ -125,17 +138,6 @@ def lost_peer(record) end - # - # Ensures that all the dumps responsible for +key+ - # has chunks for that key. - # - def redistribute(key) - # Since fetching a key will try to reconstruct the value - # if bits are missing, and inserting a key makes sure it - # is present enough, this is as simple as get+set. 
- @officer[key] = @officer[key] - end - end end Modified: trunk/archipelago/lib/archipelago/sanitation.rb =================================================================== --- trunk/archipelago/lib/archipelago/sanitation.rb 2007-02-09 11:06:40 UTC (rev 209) +++ trunk/archipelago/lib/archipelago/sanitation.rb 2007-02-09 17:53:27 UTC (rev 210) @@ -101,8 +101,9 @@ # # Write +key+ and +value+ into the site network with a good level of redundancy etc. # - def []=(key, value) - t = [Time.now.to_i].pack("I") + # Optionally the timestamp +t+ can be provided, but it defaults to now. + # + def []=(key, value, t = [Time.now.to_i].pack("I")) super_string = Oneliner::SuperString.new(value) nr_of_needed_chunks = @minimum_nr_of_chunks / @minimum_redundancy_ratio chunk_size = (super_string.size / nr_of_needed_chunks) + @metadata_overhead @@ -112,8 +113,9 @@ dump_hash.t_each do |dump_id, nr_of_chunks_needed| @sites[dump_id][:service].insert!(key, (0...nr_of_chunks_needed).collect do |nr_of_chunks_needed| - t + super_string.encode(chunk_size) - end) + super_string.encode(chunk_size) + end, + t) end end @@ -128,40 +130,7 @@ # Get the data for +key+ in the site network. # def [](key) - dump_hash = responsible_sites(key, @minimum_nr_of_chunks) - dump_ids = dump_hash.keys - newest_timestamp = "\000\000\000\000" - threads = [] - rval = Oneliner::SuperString.new - rval.extend(MonitorMixin) - - dump_hash.t_each do |dump_id, nr_of_chunks_available| - site = @sites[dump_id][:service] - chunks = site.fetch(key) - rval.mon_synchronize do - while !rval.decode_done? && chunks.size > 0 - chunk = chunks.shift - t = chunk[0...4] - data = chunk[4..-1] - if t > newest_timestamp - rval = Oneliner::SuperString.new - newest_timestamp = t - end - - if t == newest_timestamp - rval.decode!(data) - end - end - end - end - - if rval.decode_done? 
- return rval.to_s - else - raise NotEnoughDataException.new(self, key) if newest_timestamp != "\000\000\000\000" - return nil - end - + fetch(key).first end # @@ -211,9 +180,59 @@ @sites[get_least_greater_than(@sites, dump_id, 1).last][:service] end + # + # Ensures that all the dumps responsible for +key+ + # has chunks for that key without changing the timestamp + # for +key+. + # + def redistribute(key) + value, timestamp = fetch(key) + self.[]=(key, value, timestamp) + end + private # + # Returns [the value for +key+, the timestamp for the value]. + # + def fetch(key) + dump_hash = responsible_sites(key, @minimum_nr_of_chunks) + dump_ids = dump_hash.keys + newest_timestamp = "\000\000\000\000" + threads = [] + rval = Oneliner::SuperString.new + rval.extend(MonitorMixin) + + dump_hash.t_each do |dump_id, nr_of_chunks_available| + site = @sites[dump_id][:service] + chunks = site.fetch(key) + rval.mon_synchronize do + while !rval.decode_done? && chunks.size > 0 + chunk = chunks.shift + t = chunk[0...4] + data = chunk[4..-1] + if t > newest_timestamp + rval = Oneliner::SuperString.new + newest_timestamp = t + end + + if t == newest_timestamp + rval.decode!(data) + end + end + end + end + + if rval.decode_done? + return [rval.to_s, newest_timestamp] + else + raise NotEnoughDataException.new(self, key) if newest_timestamp != "\000\000\000\000" + return [nil, nil] + end + + end + + # # Gets the +n+ smallest keys from +hash+ that # are greater than +o+. 
# From nobody at rubyforge.org Fri Feb 9 13:58:13 2007 From: nobody at rubyforge.org (nobody at rubyforge.org) Date: Fri, 9 Feb 2007 13:58:13 -0500 (EST) Subject: [Archipelago-submits] [211] trunk/archipelago/lib/archipelago: made dump use its own jockey to subscribe Message-ID: <20070209185813.9D30052423FB@rubyforge.org> Revision: 211 Author: zond Date: 2007-02-09 13:58:13 -0500 (Fri, 09 Feb 2007) Log Message: ----------- made dump use its own jockey to subscribe Modified Paths: -------------- trunk/archipelago/lib/archipelago/dump.rb trunk/archipelago/lib/archipelago/sanitation.rb Modified: trunk/archipelago/lib/archipelago/dump.rb =================================================================== --- trunk/archipelago/lib/archipelago/dump.rb 2007-02-09 17:53:27 UTC (rev 210) +++ trunk/archipelago/lib/archipelago/dump.rb 2007-02-09 18:58:13 UTC (rev 211) @@ -65,17 +65,13 @@ def around_publish(&publish_block) yield - @officer.subscribe(:found, @officer.site_description, service_id) do |record| - found_peer(record) - end - @officer.subscribe(:lost, @officer.site_description, service_id) do |record| + @jockey.subscribe(:lost, @officer.site_description, service_id) do |record| lost_peer(record) end end def around_stop(&block) - @officer.unsubscribe(:found, @officer.site_description, service_id) - @officer.unsubscribe(:lost, @officer.site_description, service_id) + @jockey.unsubscribe(:lost, @officer.site_description, service_id) yield end Modified: trunk/archipelago/lib/archipelago/sanitation.rb =================================================================== --- trunk/archipelago/lib/archipelago/sanitation.rb 2007-02-09 17:53:27 UTC (rev 210) +++ trunk/archipelago/lib/archipelago/sanitation.rb 2007-02-09 18:58:13 UTC (rev 211) @@ -16,7 +16,6 @@ # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. require 'archipelago/client' -require 'forwardable' require 'monitor' module Archipelago @@ -80,9 +79,7 @@ # work. 
# class Officer < Archipelago::Client::Base - extend Forwardable attr_reader :sites, :site_description - def_delegators :@jockey, :subscribe, :unsubscribe def initialize(options = {}) super(options) From nobody at rubyforge.org Fri Feb 9 20:59:10 2007 From: nobody at rubyforge.org (nobody at rubyforge.org) Date: Fri, 9 Feb 2007 20:59:10 -0500 (EST) Subject: [Archipelago-submits] [212] trunk/archipelago: removed lots of stuff that doesnt fit in with the new maintenance protocol plan Message-ID: <20070210015911.077EE524242F@rubyforge.org> Revision: 212 Author: zond Date: 2007-02-09 20:59:10 -0500 (Fri, 09 Feb 2007) Log Message: ----------- removed lots of stuff that doesnt fit in with the new maintenance protocol plan Modified Paths: -------------- trunk/archipelago/doc/dump maintenance.dia trunk/archipelago/lib/archipelago/dump.rb trunk/archipelago/lib/archipelago/sanitation.rb trunk/archipelago/tests/sanitation_test.rb Modified: trunk/archipelago/doc/dump maintenance.dia =================================================================== (Binary files differ) Modified: trunk/archipelago/lib/archipelago/dump.rb =================================================================== --- trunk/archipelago/lib/archipelago/dump.rb 2007-02-09 18:58:13 UTC (rev 211) +++ trunk/archipelago/lib/archipelago/dump.rb 2007-02-10 01:59:10 UTC (rev 212) @@ -105,32 +105,10 @@ return @db.delete(key) end - def master_to?(other_id) - @officer.master_to?(service_id, other_id) - end - - def slave_to?(other_id) - @officer.master_to?(other_id, service_id) - end - - def last_slave - @officer.last_slave_to(service_id) - end - def next @officer.next_to(service_id) end - def found_peer(record) - Thread.new do - if master_to?(record[:service_id]) - - end - end - if slave_to?(record[:service_id]) - end - end - def lost_peer(record) end Modified: trunk/archipelago/lib/archipelago/sanitation.rb =================================================================== --- 
trunk/archipelago/lib/archipelago/sanitation.rb 2007-02-09 18:58:13 UTC (rev 211) +++ trunk/archipelago/lib/archipelago/sanitation.rb 2007-02-10 01:59:10 UTC (rev 212) @@ -157,27 +157,6 @@ end # - # Returns whether +potential_master_id+ is supposed to be a master to +potential_slave_id+. - # - def master_to?(potential_master_id, potential_slave_id) - get_least_greater_than(@sites, potential_master_id, @minimum_nr_of_chunks - 1).include?(potential_slave_id) - end - - # - # Returns the last slave to +master_id+. - # - def last_slave_to(master_id) - @sites[get_least_greater_than(@sites, master_id, @minimum_nr_of_chunks - 1).last][:service] - end - - # - # Returns the next site after +dump_id+. - # - def next_to(dump_id) - @sites[get_least_greater_than(@sites, dump_id, 1).last][:service] - end - - # # Ensures that all the dumps responsible for +key+ # has chunks for that key without changing the timestamp # for +key+. Modified: trunk/archipelago/tests/sanitation_test.rb =================================================================== --- trunk/archipelago/tests/sanitation_test.rb 2007-02-09 18:58:13 UTC (rev 211) +++ trunk/archipelago/tests/sanitation_test.rb 2007-02-10 01:59:10 UTC (rev 212) @@ -43,30 +43,6 @@ s.include?(dump.service_id) end end - 10.times do |pot_mast| - 10.times do |pot_slave| - if (d = pot_slave - pot_mast) > 0 && d < 3 - assert(dumps[pot_mast].master_to?(dumps[pot_slave].service_id), - "#{dumps[pot_mast].service_id}.master_to?(#{dumps[pot_slave].service_id}) should be true but is not. known dumps are #{cleaner2.sites.keys.inspect}") - assert(dumps[pot_slave].slave_to?(dumps[pot_mast].service_id), - "#{dumps[pot_slave].service_id}.slave_to?(#{dumps[pot_mast].service_id}) should be true but is not. 
known dumps are #{cleaner2.sites.keys.inspect}") - end - if (d = (dumps.size + pot_slave) - pot_mast) > 0 && d < 3 - assert(dumps[pot_mast].master_to?(dumps[pot_slave].service_id), - "#{dumps[pot_mast].service_id}.master_to?(#{dumps[pot_slave].service_id}) should be true but is not. known dumps are #{cleaner2.sites.keys.inspect}") - assert(dumps[pot_slave].slave_to?(dumps[pot_mast].service_id), - "#{dumps[pot_slave].service_id}.slave_to?(#{dumps[pot_mast].service_id}) should be true but is not. known dumps are #{cleaner2.sites.keys.inspect}") - end - end - end - 9.times do |site_id| - assert_equal(dumps[site_id + 1].service_id, dumps[site_id].next.service_id) - end - assert_equal(dumps.first.service_id, dumps.last.next.service_id) - dumps.each do |dump| - assert_equal(dump.next.next.service_id, - dump.last_slave.service_id) - end ensure dumps.each do |dump| dump.stop! From nobody at rubyforge.org Wed Feb 14 11:04:11 2007 From: nobody at rubyforge.org (nobody at rubyforge.org) Date: Wed, 14 Feb 2007 11:04:11 -0500 (EST) Subject: [Archipelago-submits] [213] trunk/oneliner/lib/oneliner/superstring.rb: better faster decode_done? Message-ID: <20070214160411.57AEA524098F@rubyforge.org> Revision: 213 Author: zond Date: 2007-02-14 11:04:11 -0500 (Wed, 14 Feb 2007) Log Message: ----------- better faster decode_done? Modified Paths: -------------- trunk/oneliner/lib/oneliner/superstring.rb Modified: trunk/oneliner/lib/oneliner/superstring.rb =================================================================== --- trunk/oneliner/lib/oneliner/superstring.rb 2007-02-10 01:59:10 UTC (rev 212) +++ trunk/oneliner/lib/oneliner/superstring.rb 2007-02-14 16:04:11 UTC (rev 213) @@ -112,7 +112,9 @@ # Returns whether decoding is done. # def decode_done? - if @decode_done.nil? && @nr_of_data_blocks + return false unless defined?(@decode_done) && defined?(@nr_of_data_blocks) + + unless @decode_done data = @blocks[0... 
@nr_of_data_blocks] @decode_done = data.compact.size == data.size end From nobody at rubyforge.org Wed Feb 14 12:33:52 2007 From: nobody at rubyforge.org (nobody at rubyforge.org) Date: Wed, 14 Feb 2007 12:33:52 -0500 (EST) Subject: [Archipelago-submits] [214] trunk/archipelago: added found-node maintenance to dump. Message-ID: <20070214173352.E31AE52409AA@rubyforge.org> Revision: 214 Author: zond Date: 2007-02-14 12:33:52 -0500 (Wed, 14 Feb 2007) Log Message: ----------- added found-node maintenance to dump. added identification methods to sanitation and dump. added tests. Modified Paths: -------------- trunk/archipelago/lib/archipelago/dump.rb trunk/archipelago/lib/archipelago/sanitation.rb trunk/archipelago/tests/sanitation_test.rb Modified: trunk/archipelago/lib/archipelago/dump.rb =================================================================== --- trunk/archipelago/lib/archipelago/dump.rb 2007-02-14 16:04:11 UTC (rev 213) +++ trunk/archipelago/lib/archipelago/dump.rb 2007-02-14 17:33:52 UTC (rev 214) @@ -36,6 +36,11 @@ class Site # + # The minimum pause between checking if keys belong with us. + # + CHECK_INTERVAL = 30 + + # # The Site can be published. # include Archipelago::Disco::Publishable @@ -57,6 +62,11 @@ @db = @persistence_provider.get_dup_tree("db") # + # The minimum pause between checking if keys belong with us. + # + @check_interval = options[:check_interval] || CHECK_INTERVAL + + # # Use the given options to initialize the publishable # instance variables. # @@ -68,10 +78,52 @@ @jockey.subscribe(:lost, @officer.site_description, service_id) do |record| lost_peer(record) end + start_edge_check end + def check_key(key) + if belongs_here?(key) + return true + else + begin + @officer.redistribute(key) + @db.delete(key) + rescue Archipelago::Sanitation::NotEnoughDataException => e + # What shall we do in this case? + # * Nothing, and wait for the admins to make a manual check? + # * Send an email someplace? 
+ # * Store it somewhere so that the manual check goes faster? + ensure + return false + end + end + end + + def belongs_here?(key) + @officer.belongs_at?(self.service_id, key) + end + + def start_edge_check + @edge_check_thread = Thread.new do + loop do + begin + @db.reverse_each_key("ffffffffffffffffffffffffffffffffffffffff") do |key| + break if check_key(key) + end + @db.each_key do |key| + break if check_key(key) + end + rescue Exception => e + # /moo + end + sleep(@check_interval) + end + end + end + def around_stop(&block) @jockey.unsubscribe(:lost, @officer.site_description, service_id) + @edge_check_thread.kill yield end @@ -105,11 +157,19 @@ return @db.delete(key) end - def next - @officer.next_to(service_id) + def right_before?(other_service_id) + @officer.next_to?(service_id, other_service_id) end + def right_after?(other_service_id) + @officer.next_to?(other_service_id, service_id) + end + def lost_peer(record) + if right_before?(record[:service_id]) + end + if right_after?(record[:service_id]) + end end end Modified: trunk/archipelago/lib/archipelago/sanitation.rb =================================================================== --- trunk/archipelago/lib/archipelago/sanitation.rb 2007-02-14 16:04:11 UTC (rev 213) +++ trunk/archipelago/lib/archipelago/sanitation.rb 2007-02-14 17:33:52 UTC (rev 214) @@ -98,6 +98,8 @@ # # Write +key+ and +value+ into the site network with a good level of redundancy etc. # + # The key should must be a SHA1 hash. + # # Optionally the timestamp +t+ can be provided, but it defaults to now. 
# def []=(key, value, t = [Time.now.to_i].pack("I")) @@ -106,7 +108,7 @@ chunk_size = (super_string.size / nr_of_needed_chunks) + @metadata_overhead chunk_size = @minimum_recoverable_size / nr_of_needed_chunks if chunk_size < @minimum_recoverable_size / nr_of_needed_chunks - dump_hash = responsible_sites(key, @minimum_nr_of_chunks) + dump_hash = responsible_sites(key) dump_hash.t_each do |dump_id, nr_of_chunks_needed| @sites[dump_id][:service].insert!(key, (0...nr_of_chunks_needed).collect do |nr_of_chunks_needed| @@ -117,7 +119,7 @@ end def delete!(key) - dump_hash = responsible_sites(key, @minimum_nr_of_chunks) + dump_hash = responsible_sites(key) dump_hash.t_each do |dump_id, nr_of_chunks_available| @sites[dump_id][:service].delete!(key) end @@ -126,6 +128,8 @@ # # Get the data for +key+ in the site network. # + # The key must be a SHA1 hash. + # def [](key) fetch(key).first end @@ -144,12 +148,12 @@ # # Will loop to the beginning if the number of elements run out. # - def responsible_sites(key, n) + def responsible_sites(key) raise NoRemoteDatabaseAvailableException.new(self) if @sites.empty? rval = {} rval.extend(Archipelago::Current::ThreadedCollection) - get_least_greater_than(@sites, key, n).each do |id| + get_least_greater_than(@sites, key, @minimum_nr_of_chunks).each do |id| rval[id] ||= 0 rval[id] += 1 end @@ -157,22 +161,41 @@ end # + # Returns whether the key belongs at the service with given id. + # + def belongs_at?(service_id, key) + responsible_sites(key).include?(service_id) + end + + # # Ensures that all the dumps responsible for +key+ # has chunks for that key without changing the timestamp # for +key+. # def redistribute(key) value, timestamp = fetch(key) + # + # Even if fetch didnt raise the exception we must, cause this is serious business. + # + raise NotEnoughDataException.new(self, key) if value.nil? self.[]=(key, value, timestamp) end + # + # Returns whether +service_id1+ and +service_id2+ + # are in that order in the array. 
+ # + def next_to?(service_id1, service_id2) + return @sites.include?(service_id1) && get_least_greater_than(@sites, service_id1, 1).first == service_id2 + end + private # # Returns [the value for +key+, the timestamp for the value]. # def fetch(key) - dump_hash = responsible_sites(key, @minimum_nr_of_chunks) + dump_hash = responsible_sites(key) dump_ids = dump_hash.keys newest_timestamp = "\000\000\000\000" threads = [] Modified: trunk/archipelago/tests/sanitation_test.rb =================================================================== --- trunk/archipelago/tests/sanitation_test.rb 2007-02-14 16:04:11 UTC (rev 213) +++ trunk/archipelago/tests/sanitation_test.rb 2007-02-14 17:33:52 UTC (rev 214) @@ -43,6 +43,15 @@ s.include?(dump.service_id) end end + 10.times do |n| + if n < 9 + assert(dumps[n].right_before?(dumps[n+1].service_id), "#{dumps[n].service_id} is supposed to be right_before? #{dumps[n+1].service_id}, but isnt") + assert(dumps[n+1].right_after?(dumps[n].service_id), "#{dumps[n+1].service_id} is supposed to be right_after? #{dumps[n].service_id}, but isnt") + else + assert(dumps[n].right_before?(dumps[0].service_id), "#{dumps[n].service_id} is supposed to be right_before? #{dumps[0].service_id}, but isnt") + assert(dumps[0].right_after?(dumps[n].service_id), "#{dumps[0].service_id} is supposed to be right_after? #{dumps[n].service_id}, but isnt") + end + end ensure dumps.each do |dump| dump.stop! From nobody at rubyforge.org Sun Feb 18 10:36:36 2007 From: nobody at rubyforge.org (nobody at rubyforge.org) Date: Sun, 18 Feb 2007 10:36:36 -0500 (EST) Subject: [Archipelago-submits] [215] trunk/archipelago: added tests to sanitation and dump. Message-ID: <20070218153636.4C76A5240A4E@rubyforge.org> Revision: 215 Author: zond Date: 2007-02-18 10:36:35 -0500 (Sun, 18 Feb 2007) Log Message: ----------- added tests to sanitation and dump. added a fancy each method to dump. 
fixed bugs found by the tests Modified Paths: -------------- trunk/archipelago/lib/archipelago/dump.rb trunk/archipelago/lib/archipelago/sanitation.rb trunk/archipelago/tests/dump_test.rb trunk/archipelago/tests/sanitation_test.rb trunk/archipelago/tests/test_helper.rb Modified: trunk/archipelago/lib/archipelago/dump.rb =================================================================== --- trunk/archipelago/lib/archipelago/dump.rb 2007-02-14 17:33:52 UTC (rev 214) +++ trunk/archipelago/lib/archipelago/dump.rb 2007-02-18 15:36:35 UTC (rev 215) @@ -165,8 +165,16 @@ @officer.next_to?(other_service_id, service_id) end + def each(options = {}, &block) + @db.each(options[:start]) do |key, value| + break if options[:stop] && key > options[:stop] + yield(key, value) + end + end + def lost_peer(record) if right_before?(record[:service_id]) + end if right_after?(record[:service_id]) end Modified: trunk/archipelago/lib/archipelago/sanitation.rb =================================================================== --- trunk/archipelago/lib/archipelago/sanitation.rb 2007-02-14 17:33:52 UTC (rev 214) +++ trunk/archipelago/lib/archipelago/sanitation.rb 2007-02-18 15:36:35 UTC (rev 215) @@ -189,6 +189,8 @@ return @sites.include?(service_id1) && get_least_greater_than(@sites, service_id1, 1).first == service_id2 end + + private # @@ -212,6 +214,7 @@ data = chunk[4..-1] if t > newest_timestamp rval = Oneliner::SuperString.new + rval.extend(MonitorMixin) newest_timestamp = t end Modified: trunk/archipelago/tests/dump_test.rb =================================================================== --- trunk/archipelago/tests/dump_test.rb 2007-02-14 17:33:52 UTC (rev 214) +++ trunk/archipelago/tests/dump_test.rb 2007-02-18 15:36:35 UTC (rev 215) @@ -34,4 +34,40 @@ @d.fetch("key2")) end + def test_too_few + @d.instance_eval do + @db["a"] = "aaaab" + @db["a"] = "aaaac" + end + assert_equal(["aaaab","aaaac"].sort, + @d.fetch("a").sort) + @d.insert!("a", ["aaaab","aaaac","aaaad"]) + 
assert_equal(["aaaab","aaaac","aaaad"].sort, + @d.fetch("a").sort) + end + + def test_too_many + @d.instance_eval do + @db["a"] = "aaaab" + @db["a"] = "aaaac" + end + assert_equal(["aaaab","aaaac"].sort, + @d.fetch("a").sort) + @d.insert!("a", ["aaaad"]) + assert_equal(["aaaad"], + @d.fetch("a")) + end + + def test_wrong_timestamp + @d.instance_eval do + @db["a"] = "aaaab" + @db["a"] = "aaaac" + end + assert_equal(["aaaab","aaaac"].sort, + @d.fetch("a").sort) + @d.insert!("a", ["aaabb", "aaabc"]) + assert_equal(["aaabb","aaabc"].sort, + @d.fetch("a").sort) + end + end Modified: trunk/archipelago/tests/sanitation_test.rb =================================================================== --- trunk/archipelago/tests/sanitation_test.rb 2007-02-14 17:33:52 UTC (rev 214) +++ trunk/archipelago/tests/sanitation_test.rb 2007-02-18 15:36:35 UTC (rev 215) @@ -25,6 +25,14 @@ DRb.stop_service end + def test_missing_bits + s1 = Oneliner::SuperString.new("brappa") + @d.insert!("a", "aaaa" + s1.encode(12)) + assert_raise(Archipelago::Sanitation::NotEnoughDataException) do + @c["a"] + end + end + def test_navigation @d.stop! dumps = [] @@ -52,6 +60,11 @@ assert(dumps[0].right_after?(dumps[n].service_id), "#{dumps[0].service_id} is supposed to be right_after? #{dumps[n].service_id}, but isnt") end end + s1 = Oneliner::SuperString.new("brappa") + s2 = Oneliner::SuperString.new("brappa2") + dumps[0].insert!("a", "aaaa" + s1.encode(20)) + dumps[1].insert!("a", "aaab" + s2.encode(200)) + assert_equal("brappa2", cleaner2["a"]) ensure dumps.each do |dump| dump.stop! 
Modified: trunk/archipelago/tests/test_helper.rb =================================================================== --- trunk/archipelago/tests/test_helper.rb 2007-02-14 17:33:52 UTC (rev 214) +++ trunk/archipelago/tests/test_helper.rb 2007-02-18 15:36:35 UTC (rev 215) @@ -17,6 +17,7 @@ require 'socket' require 'ipaddr' require 'thread' +require 'rubygems' class TestTransaction def join(o) From nobody at rubyforge.org Sun Feb 18 13:30:31 2007 From: nobody at rubyforge.org (nobody at rubyforge.org) Date: Sun, 18 Feb 2007 13:30:31 -0500 (EST) Subject: [Archipelago-submits] [216] trunk/archipelago: refactored the api between Dump and Cleaner, modified the tests accordingly, added tests, refactored Cleaner util methods. Message-ID: <20070218183031.88A145240AB3@rubyforge.org> Revision: 216 Author: zond Date: 2007-02-18 13:30:31 -0500 (Sun, 18 Feb 2007) Log Message: ----------- refactored the api between Dump and Cleaner, modified the tests accordingly, added tests, refactored Cleaner util methods. Modified Paths: -------------- trunk/archipelago/lib/archipelago/dump.rb trunk/archipelago/lib/archipelago/sanitation.rb trunk/archipelago/tests/dump_test.rb trunk/archipelago/tests/sanitation_test.rb Modified: trunk/archipelago/lib/archipelago/dump.rb =================================================================== --- trunk/archipelago/lib/archipelago/dump.rb 2007-02-18 15:36:35 UTC (rev 215) +++ trunk/archipelago/lib/archipelago/dump.rb 2007-02-18 18:30:31 UTC (rev 216) @@ -127,7 +127,7 @@ yield end - def insert!(key, values, timestamp = "") + def insert!(key, values, timestamp = "\000\000\000\000") if (duplicates = @db.duplicates(key)).empty? 
values.each do |value| @db[key] = timestamp + value @@ -150,7 +150,9 @@ end def fetch(key) - return @db.duplicates(key) + values = @db.duplicates(key).collect do |value| + [value[0...4], value[4..-1]] + end end def delete!(key) @@ -174,9 +176,10 @@ def lost_peer(record) if right_before?(record[:service_id]) - + @officer.redistribute_keys_after(service_id) end if right_after?(record[:service_id]) + @officer.redistribute_keys_before(service_id) end end Modified: trunk/archipelago/lib/archipelago/sanitation.rb =================================================================== --- trunk/archipelago/lib/archipelago/sanitation.rb 2007-02-18 15:36:35 UTC (rev 215) +++ trunk/archipelago/lib/archipelago/sanitation.rb 2007-02-18 18:30:31 UTC (rev 216) @@ -189,7 +189,12 @@ return @sites.include?(service_id1) && get_least_greater_than(@sites, service_id1, 1).first == service_id2 end + def redistribute_keys_after(service_id) + + end + def redistribute_keys_before(service_id) + end private @@ -209,9 +214,7 @@ chunks = site.fetch(key) rval.mon_synchronize do while !rval.decode_done? && chunks.size > 0 - chunk = chunks.shift - t = chunk[0...4] - data = chunk[4..-1] + t, data = chunks.shift if t > newest_timestamp rval = Oneliner::SuperString.new rval.extend(MonitorMixin) @@ -241,14 +244,33 @@ # Will loop to the beginning if the number of elements run out. # def get_least_greater_than(hash, o, n) - sorted_key_array = hash.keys.sort - 0.upto(sorted_key_array.size - 1) do |index| - key = sorted_key_array[index] - if key > o - return get_some(sorted_key_array, index, n) + return get_matching(hash.keys.sort, o, n, :>) + end + + # + # Gets the +n+ largest keys from +hash+ that + # are less than +o+. + # + # Will loop to the end if the number of elements run out. + # + def get_greatest_less_than(hash, o, n) + return get_matching(hash.keys.sort.reverse, o, n, :<) + end + + # + # Will get the +n+ consecutive elements of +list+ where + # [the first one].send(+operator+, +o+) returns true. 
+ # + # If none matches, will return the +n+ first elements. + # + def get_matching(list, o, n, operator) + list.size.times do |index| + key = list[index] + if key.send(operator, o) + return get_some(list, index, n) end end - return get_some(sorted_key_array, 0, n) + return get_some(list, 0, n) end # Modified: trunk/archipelago/tests/dump_test.rb =================================================================== --- trunk/archipelago/tests/dump_test.rb 2007-02-18 15:36:35 UTC (rev 215) +++ trunk/archipelago/tests/dump_test.rb 2007-02-18 18:30:31 UTC (rev 216) @@ -15,59 +15,56 @@ def test_insert_fetch @d.insert!("key", ["value", "value1"]) + + assert_equal(["value", "value1"].sort, + @d.fetch("key").collect do |e| e.last end.sort) + @d.insert!("key2", ["value2"]) @d.insert!("key2", ["value3"]) - assert_equal(["value", "value1"].sort, - @d.fetch("key").sort) - assert_equal(["value3"].sort, - @d.fetch("key2").sort) + assert_equal(["value2"], + @d.fetch("key2").collect do |e| e.last end.sort) - @d.insert!("key", ["value2", "value3"]) + @d.insert!("key2", ["value3"], "hehu") + assert_equal(["value3"], + @d.fetch("key2").collect do |e| e.last end.sort) + + @d.insert!("key", ["value2", "value3"], "apap") assert_equal(["value2", "value3"].sort, - @d.fetch("key").sort) + @d.fetch("key").collect do |e| e.last end.sort) @d.delete!("key") @d.delete!("key2") assert_equal([], - @d.fetch("key")) + @d.fetch("key").collect do |e| e.last end) assert_equal([], - @d.fetch("key2")) + @d.fetch("key2").collect do |e| e.last end) end def test_too_few - @d.instance_eval do - @db["a"] = "aaaab" - @db["a"] = "aaaac" - end - assert_equal(["aaaab","aaaac"].sort, - @d.fetch("a").sort) - @d.insert!("a", ["aaaab","aaaac","aaaad"]) - assert_equal(["aaaab","aaaac","aaaad"].sort, - @d.fetch("a").sort) + @d.insert!("a", ["b","c"]) + assert_equal(["b","c"].sort, + @d.fetch("a").collect do |e| e.last end.sort) + @d.insert!("a", ["d","c","b"]) + assert_equal(["b","c","d"].sort, + @d.fetch("a").collect 
do |e| e.last end.sort) end def test_too_many - @d.instance_eval do - @db["a"] = "aaaab" - @db["a"] = "aaaac" - end - assert_equal(["aaaab","aaaac"].sort, - @d.fetch("a").sort) - @d.insert!("a", ["aaaad"]) - assert_equal(["aaaad"], - @d.fetch("a")) + @d.insert!("a", ["b","c"]) + assert_equal(["b","c"].sort, + @d.fetch("a").collect do |e| e.last end.sort) + @d.insert!("a", ["d"]) + assert_equal(["d"], + @d.fetch("a").collect do |e| e.last end) end def test_wrong_timestamp - @d.instance_eval do - @db["a"] = "aaaab" - @db["a"] = "aaaac" - end - assert_equal(["aaaab","aaaac"].sort, - @d.fetch("a").sort) - @d.insert!("a", ["aaabb", "aaabc"]) - assert_equal(["aaabb","aaabc"].sort, - @d.fetch("a").sort) + @d.insert!("a", ["b","c"], "epap") + assert_equal(["b","c"].sort, + @d.fetch("a").collect do |e| e.last end.sort) + @d.insert!("a", ["x", "y"], "epao") + assert_equal(["x","y"].sort, + @d.fetch("a").collect do |e| e.last end.sort) end end Modified: trunk/archipelago/tests/sanitation_test.rb =================================================================== --- trunk/archipelago/tests/sanitation_test.rb 2007-02-18 15:36:35 UTC (rev 215) +++ trunk/archipelago/tests/sanitation_test.rb 2007-02-18 18:30:31 UTC (rev 216) @@ -27,7 +27,7 @@ def test_missing_bits s1 = Oneliner::SuperString.new("brappa") - @d.insert!("a", "aaaa" + s1.encode(12)) + @d.insert!("a", s1.encode(9), "abab") assert_raise(Archipelago::Sanitation::NotEnoughDataException) do @c["a"] end @@ -62,9 +62,17 @@ end s1 = Oneliner::SuperString.new("brappa") s2 = Oneliner::SuperString.new("brappa2") - dumps[0].insert!("a", "aaaa" + s1.encode(20)) - dumps[1].insert!("a", "aaab" + s2.encode(200)) + dumps[0].insert!("a", s1.encode(20), "aaaa") + dumps[1].insert!("a", s2.encode(200), "aaab") assert_equal("brappa2", cleaner2["a"]) + assert_equal([dumps[9].service_id, dumps[0].service_id, dumps[1].service_id].sort, + cleaner2.instance_eval do + get_greatest_less_than(@sites, dumps[2].service_id, 3) + end.sort) + 
assert_equal([dumps[8].service_id, dumps[9].service_id, dumps[0].service_id].sort, + cleaner2.instance_eval do + get_least_greater_than(@sites, dumps[7].service_id, 3) + end.sort) ensure dumps.each do |dump| dump.stop! From nobody at rubyforge.org Sun Feb 18 14:44:48 2007 From: nobody at rubyforge.org (nobody at rubyforge.org) Date: Sun, 18 Feb 2007 14:44:48 -0500 (EST) Subject: [Archipelago-submits] [217] trunk/archipelago: made lots of methods in dump private. Message-ID: <20070218194448.4CF1D5240AB3@rubyforge.org> Revision: 217 Author: zond Date: 2007-02-18 14:44:48 -0500 (Sun, 18 Feb 2007) Log Message: ----------- made lots of methods in dump private. commented dump.rb properly. added the last parts of the maintenance protocol to dump and sanitation - all it needs now is proper testing :O. modified tests. Modified Paths: -------------- trunk/archipelago/lib/archipelago/dump.rb trunk/archipelago/lib/archipelago/sanitation.rb trunk/archipelago/tests/sanitation_test.rb Modified: trunk/archipelago/lib/archipelago/dump.rb =================================================================== --- trunk/archipelago/lib/archipelago/dump.rb 2007-02-18 18:30:31 UTC (rev 216) +++ trunk/archipelago/lib/archipelago/dump.rb 2007-02-18 19:44:48 UTC (rev 217) @@ -73,6 +73,65 @@ initialize_publishable(options) end + # + # Will insert the array +values+ under +key+ in the db. + # + # Will insert them with the +timestamp+ or "\000\000\000\000". + # + # If values with the same timestamp already exists, it will + # overwrite them if they are greater in number than the given + # +values+. If they are fewer enough of the given +values+ + # will be inserted to get the same number of values in the + # database as in +values. + # + # Any values with a different timestamp will be deleted. + # + def insert!(key, values, timestamp = "\000\000\000\000") + if (duplicates = @db.duplicates(key)).empty? 
+ values.each do |value| + @db[key] = timestamp + value + end + else + my_timestamp = duplicates.first[0...4] + if timestamp != my_timestamp || duplicates.size > values.size + @db.env.begin(BDB::TXN_COMMIT, @db) do |txn, db| + db.delete(key) + values.each do |value| + db[key] = timestamp + value + end + end + else + values[0...(values.size - duplicates.size)].each do |value| + @db[key] = timestamp + value + end + end + end + end + + # + # Fetches all duplicates of +key+. + # + # Returns [[TIMESTAMP0, VALUE0],...,[TIMESTAMPn, VALUEn]] + # + def fetch(key) + values = @db.duplicates(key).collect do |value| + [value[0...4], value[4..-1]] + end + end + + # + # Deletes +key+ from the db. + # + def delete!(key) + return @db.delete(key) + end + + private + + # + # After we have been published we will subscribe to other Archipelago::Dump::Site + # changes and start a thread that checks our edge keys. + # def around_publish(&publish_block) yield @jockey.subscribe(:lost, @officer.site_description, service_id) do |record| @@ -81,6 +140,26 @@ start_edge_check end + # + # Before we stop we will unsubscribe from other Archipelago::Dump::Site + # changes and kill our @edge_check_thread. + # + def around_stop(&block) + @jockey.unsubscribe(:lost, @officer.site_description, service_id) + @edge_check_thread.kill + yield + end + + # + # Checks whether +key+ should reside here. + # + # If not, will ask our Archipelago::Sanitation::Officer to + # redistribute it. + # + # If that succeeded will delete it from our database. + # + # Will return whether it belongs here. + # def check_key(key) if belongs_here?(key) return true @@ -99,15 +178,22 @@ end end + # + # Asks our Archipelago::Sanitation::Officer if +key+ + # should reside with us. + # def belongs_here?(key) @officer.belongs_at?(self.service_id, key) end + # + # Starts a Thread that checks our first and last keys. 
+ # def start_edge_check @edge_check_thread = Thread.new do loop do begin - @db.reverse_each_key("ffffffffffffffffffffffffffffffffffffffff") do |key| + @db.reverse_each_key do |key| break if check_key(key) end @db.each_key do |key| @@ -121,65 +207,58 @@ end end - def around_stop(&block) - @jockey.unsubscribe(:lost, @officer.site_description, service_id) - @edge_check_thread.kill - yield + # + # Returns whether +other_service_id+ is right before + # us in the big scheme of things. + # + def right_after?(other_service_id) + @officer.next_to?(other_service_id, service_id) end - def insert!(key, values, timestamp = "\000\000\000\000") - if (duplicates = @db.duplicates(key)).empty? - values.each do |value| - @db[key] = timestamp + value - end - else - my_timestamp = duplicates.first[0...4] - if timestamp != my_timestamp || duplicates.size > values.size - @db.env.begin(BDB::TXN_COMMIT, @db) do |txn, db| - db.delete(key) - values.each do |value| - db[key] = timestamp + value - end - end - else - values[0...(values.size - duplicates.size)].each do |value| - @db[key] = timestamp + value - end - end - end - end - - def fetch(key) - values = @db.duplicates(key).collect do |value| - [value[0...4], value[4..-1]] - end - end - - def delete!(key) - return @db.delete(key) - end - + # + # Returns whether +other_service_id+ is right after + # us in the big scheme of things. + # def right_before?(other_service_id) @officer.next_to?(service_id, other_service_id) end - def right_after?(other_service_id) - @officer.next_to?(other_service_id, service_id) - end - - def each(options = {}, &block) - @db.each(options[:start]) do |key, value| - break if options[:stop] && key > options[:stop] - yield(key, value) + # + # Tell the officer to redistribute +key+ and ignore any errors. + # + def redistribute_key(key) + begin + @officer.redistribute(key) + rescue Archipelago::Sanitation::NotEnoughDataException => e + # What shall we do in this case? 
+ # * Nothing, and wait for the admins to make a manual check? + # * Send an email someplace? + # * Store it somewhere so that the manual check goes faster? end end + # + # When we have lost a peer +record+ we must check if it was a neighbour of ours. + # + # If it was, then we must redistribute its keys. + # + # If it was our forward neighbour we will redistribute everything that + # we hold that we know that it as well held - everything from and including + # our second to first master upto and including ourselves. + # + # If it was our backward neighbour we will redistribute everything that + # it was master to. + # def lost_peer(record) if right_before?(record[:service_id]) - @officer.redistribute_keys_after(service_id) + @db.each_key(@officer.second_master_to(service_id)) do |key| + redistribute_key(key) + end end if right_after?(record[:service_id]) - @officer.redistribute_keys_before(service_id) + @db.reverse_each_key(record[:service_id]) do |key| + redistribute_key(key) + end end end Modified: trunk/archipelago/lib/archipelago/sanitation.rb =================================================================== --- trunk/archipelago/lib/archipelago/sanitation.rb 2007-02-18 18:30:31 UTC (rev 216) +++ trunk/archipelago/lib/archipelago/sanitation.rb 2007-02-18 19:44:48 UTC (rev 217) @@ -189,13 +189,14 @@ return @sites.include?(service_id1) && get_least_greater_than(@sites, service_id1, 1).first == service_id2 end - def redistribute_keys_after(service_id) - + # + # Returns the site after the first one that has keys that + # will be stored in the site identified by +service_id+. 
+ # + def second_master_to(service_id) + return get_greatest_less_than(@sites, service_id, @minimum_nr_of_chunks - 1).first end - def redistribute_keys_before(service_id) - end - private # Modified: trunk/archipelago/tests/sanitation_test.rb =================================================================== --- trunk/archipelago/tests/sanitation_test.rb 2007-02-18 18:30:31 UTC (rev 216) +++ trunk/archipelago/tests/sanitation_test.rb 2007-02-18 19:44:48 UTC (rev 217) @@ -53,11 +53,15 @@ end 10.times do |n| if n < 9 - assert(dumps[n].right_before?(dumps[n+1].service_id), "#{dumps[n].service_id} is supposed to be right_before? #{dumps[n+1].service_id}, but isnt") - assert(dumps[n+1].right_after?(dumps[n].service_id), "#{dumps[n+1].service_id} is supposed to be right_after? #{dumps[n].service_id}, but isnt") + assert(dumps[n].instance_eval do right_before?(dumps[n+1].service_id) end, + "#{dumps[n].service_id} is supposed to be right_before? #{dumps[n+1].service_id}, but isnt") + assert(dumps[n+1].instance_eval do right_after?(dumps[n].service_id) end, + "#{dumps[n+1].service_id} is supposed to be right_after? #{dumps[n].service_id}, but isnt") else - assert(dumps[n].right_before?(dumps[0].service_id), "#{dumps[n].service_id} is supposed to be right_before? #{dumps[0].service_id}, but isnt") - assert(dumps[0].right_after?(dumps[n].service_id), "#{dumps[0].service_id} is supposed to be right_after? #{dumps[n].service_id}, but isnt") + assert(dumps[n].instance_eval do right_before?(dumps[0].service_id) end, + "#{dumps[n].service_id} is supposed to be right_before? #{dumps[0].service_id}, but isnt") + assert(dumps[0].instance_eval do right_after?(dumps[n].service_id) end, + "#{dumps[0].service_id} is supposed to be right_after? 
#{dumps[n].service_id}, but isnt") end end s1 = Oneliner::SuperString.new("brappa") @@ -73,6 +77,10 @@ cleaner2.instance_eval do get_least_greater_than(@sites, dumps[7].service_id, 3) end.sort) + assert_equal(dumps[3].service_id, + cleaner2.second_master_to(dumps[4].service_id)) + assert_equal(dumps[9].service_id, + cleaner2.second_master_to(dumps[0].service_id)) ensure dumps.each do |dump| dump.stop! From nobody at rubyforge.org Thu Feb 22 17:07:10 2007 From: nobody at rubyforge.org (nobody at rubyforge.org) Date: Thu, 22 Feb 2007 17:07:10 -0500 (EST) Subject: [Archipelago-submits] [218] trunk/oneliner/lib/oneliner/superstring.rb: guarded against some strange long problems Message-ID: <20070222220710.A66F552421B7@rubyforge.org> Revision: 218 Author: zond Date: 2007-02-22 17:07:09 -0500 (Thu, 22 Feb 2007) Log Message: ----------- guarded against some strange long problems Modified Paths: -------------- trunk/oneliner/lib/oneliner/superstring.rb Modified: trunk/oneliner/lib/oneliner/superstring.rb =================================================================== --- trunk/oneliner/lib/oneliner/superstring.rb 2007-02-18 19:44:48 UTC (rev 217) +++ trunk/oneliner/lib/oneliner/superstring.rb 2007-02-22 22:07:09 UTC (rev 218) @@ -46,8 +46,8 @@ context = Context.new context.seed(seed) - rval = [self.size].pack("L*") - rval << [seed].pack("L*") + rval = [self.size].pack("i") + rval << [seed].pack("i") blocks = [] wanted_blocks = ((requested_size - 8) * 8) / @block_size @@ -71,11 +71,11 @@ context = Context.new - requested_size = chunk[0..3].unpack("L*").first + requested_size = chunk[0..3].unpack("i").first ensure_decode_format(requested_size) - the_seed = chunk[4..7].unpack("L*").first + the_seed = chunk[4..7].unpack("i").first chunk_data = { :blocks => expand(chunk[8..-1]), From nobody at rubyforge.org Thu Feb 22 17:33:52 2007 From: nobody at rubyforge.org (nobody at rubyforge.org) Date: Thu, 22 Feb 2007 17:33:52 -0500 (EST) Subject: [Archipelago-submits] [219] 
trunk/archipelago: fixed a few shaky tests. Message-ID: <20070222223352.7826752421B7@rubyforge.org> Revision: 219 Author: zond Date: 2007-02-22 17:33:52 -0500 (Thu, 22 Feb 2007) Log Message: ----------- fixed a few shaky tests. added an officer script. added it to the console script. added a dump to the services script. improved the pirate script. made Disco::Jockey notify a bit thriftier. added lots of debug to dump. made sanitation safer. still have massive problems with dropping services :/ Modified Paths: -------------- trunk/archipelago/lib/archipelago/disco.rb trunk/archipelago/lib/archipelago/dump.rb trunk/archipelago/lib/archipelago/sanitation.rb trunk/archipelago/script/console trunk/archipelago/script/pirate.rb trunk/archipelago/script/services.rb trunk/archipelago/tests/disco_test.rb trunk/archipelago/tests/sanitation_test.rb Added Paths: ----------- trunk/archipelago/script/officer.rb Modified: trunk/archipelago/lib/archipelago/disco.rb =================================================================== --- trunk/archipelago/lib/archipelago/disco.rb 2007-02-22 22:07:09 UTC (rev 218) +++ trunk/archipelago/lib/archipelago/disco.rb 2007-02-22 22:33:52 UTC (rev 219) @@ -362,16 +362,17 @@ # Set +key+ to +value+. # def []=(key, value) - @jockey.instance_eval do notify_subscribers(:found, value) end + if @jockey && !@hash.include?(key) + @jockey.instance_eval do notify_subscribers(:found, value) end + end @hash[key] = value end # # Delete +key+. # def delete(key) - value = @hash[key] - @jockey.instance_eval do notify_subscribers(:lost, value) end if value - @hash.delete(key) + value = @hash.delete(key) + @jockey.instance_eval do notify_subscribers(:lost, value) end if @jockey && value end # # Merge this locker with another. @@ -379,13 +380,13 @@ def merge(sd) rval = @hash.clone rval.merge!(sd.hash) - ServiceLocker.new(:hash => rval, :jockey => @jockey) + ServiceLocker.new(:hash => rval) end # # Find all containing services matching +match+. 
# def get_services(match) - rval = ServiceLocker.new(:jockey => @jockey) + rval = ServiceLocker.new self.clone.each do |service_id, service_data| if service_data.matches?(match) if service_data.valid? @@ -602,7 +603,7 @@ @outgoing << [nil, match] end - ServiceLocker.new(:jockey => self) + ServiceLocker.new end # Modified: trunk/archipelago/lib/archipelago/dump.rb =================================================================== --- trunk/archipelago/lib/archipelago/dump.rb 2007-02-22 22:07:09 UTC (rev 218) +++ trunk/archipelago/lib/archipelago/dump.rb 2007-02-22 22:33:52 UTC (rev 219) @@ -22,6 +22,7 @@ require 'archipelago/sanitation' require 'pathname' require 'bdb' +require 'drb' module Archipelago @@ -45,6 +46,8 @@ # include Archipelago::Disco::Publishable + attr_accessor :db, :debug_callable, :officer, :persistence_provider + def initialize(options = {}) # # The provider of happy magic persistent hashes of different kinds. @@ -52,9 +55,14 @@ @persistence_provider = options[:persistence_provider] || Archipelago::Hashish::BerkeleyHashishProvider.new(Pathname.new(File.expand_path(__FILE__)).parent.join("cove_tanker.db")) # + # The callable object that will get sent our debug messages if it exists. + # + @debug_callable = options[:debug_callable] + + # # The provider of checksumming magic and chunk distribution. # - @officer = options[:officer] || (defined?(Archipelago::Sanitation::CLEANER) || Archipelago::Sanitation::Officer.new) + @officer = options[:officer] || (defined?(Archipelago::Sanitation::CLEANER) ? Archipelago::Sanitation::CLEANER : Archipelago::Sanitation::Officer.new) # # The database where the data lives. 
@@ -162,9 +170,11 @@ # def check_key(key) if belongs_here?(key) + @debug_callable.call("#{service_id}.check_key(#{key}) returns true") if @debug_callable return true else begin + @debug_callable.call("#{service_id}.check_key(#{key}) redistributes, deletes and returns false") if @debug_callable @officer.redistribute(key) @db.delete(key) rescue Archipelago::Sanitation::NotEnoughDataException => e @@ -193,6 +203,7 @@ @edge_check_thread = Thread.new do loop do begin + @debug_callable.call("#{service_id}.start_edge_check doing its thang") if @debug_callable @db.reverse_each_key do |key| break if check_key(key) end @@ -208,16 +219,16 @@ end # - # Returns whether +other_service_id+ is right before - # us in the big scheme of things. + # Returns whether we are right after +other_service_id+ + # in the big scheme of things. # def right_after?(other_service_id) @officer.next_to?(other_service_id, service_id) end # - # Returns whether +other_service_id+ is right after - # us in the big scheme of things. + # Returns whether we are right before +other_service_id+ + # in the big scheme of things. # def right_before?(other_service_id) @officer.next_to?(service_id, other_service_id) @@ -228,6 +239,7 @@ # def redistribute_key(key) begin + @debug_callable.call("#{service_id}.redistribute_key(#{key}) called") if @debug_callable @officer.redistribute(key) rescue Archipelago::Sanitation::NotEnoughDataException => e # What shall we do in this case? @@ -250,14 +262,19 @@ # it was master to. 
# def lost_peer(record) + @debug_callable.call("#{service_id}.lost_peer(#{record[:service_id]}) called") if @debug_callable if right_before?(record[:service_id]) - @db.each_key(@officer.second_master_to(service_id)) do |key| + @debug_callable.call("#{service_id}.lost_peer(#{record[:service_id]}) is right_before, redistributing from #{@officer.second_master_to(service_id)}") if @debug_callable + @db.reverse_each_key do |key| redistribute_key(key) + break if key < @officer.second_master_to(service_id) end end if right_after?(record[:service_id]) - @db.reverse_each_key(record[:service_id]) do |key| + @debug_callable.call("#{service_id}.lost_peer(#{record[:service_id]}) is right_after, redistributing up to #{record[:service_id]}") if @debug_callable + @db.each_key do |key| redistribute_key(key) + break if key > record[:service_id] end end end Modified: trunk/archipelago/lib/archipelago/sanitation.rb =================================================================== --- trunk/archipelago/lib/archipelago/sanitation.rb 2007-02-22 22:07:09 UTC (rev 218) +++ trunk/archipelago/lib/archipelago/sanitation.rb 2007-02-22 22:33:52 UTC (rev 219) @@ -16,7 +16,9 @@ # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. require 'archipelago/client' +require 'archipelago/dump' require 'monitor' +require 'drb' module Archipelago @@ -107,8 +109,10 @@ nr_of_needed_chunks = @minimum_nr_of_chunks / @minimum_redundancy_ratio chunk_size = (super_string.size / nr_of_needed_chunks) + @metadata_overhead chunk_size = @minimum_recoverable_size / nr_of_needed_chunks if chunk_size < @minimum_recoverable_size / nr_of_needed_chunks - + + update_services! dump_hash = responsible_sites(key) + super_string.encode(8) dump_hash.t_each do |dump_id, nr_of_chunks_needed| @sites[dump_id][:service].insert!(key, (0...nr_of_chunks_needed).collect do |nr_of_chunks_needed| @@ -119,6 +123,7 @@ end def delete!(key) + update_services! 
dump_hash = responsible_sites(key) dump_hash.t_each do |dump_id, nr_of_chunks_available| @sites[dump_id][:service].delete!(key) @@ -203,6 +208,7 @@ # Returns [the value for +key+, the timestamp for the value]. # def fetch(key) + update_services! dump_hash = responsible_sites(key) dump_ids = dump_hash.keys newest_timestamp = "\000\000\000\000" @@ -212,20 +218,24 @@ dump_hash.t_each do |dump_id, nr_of_chunks_available| site = @sites[dump_id][:service] - chunks = site.fetch(key) - rval.mon_synchronize do - while !rval.decode_done? && chunks.size > 0 - t, data = chunks.shift - if t > newest_timestamp - rval = Oneliner::SuperString.new - rval.extend(MonitorMixin) - newest_timestamp = t + begin + chunks = site.fetch(key) + rval.mon_synchronize do + while chunks.size > 0 + t, data = chunks.shift + if t > newest_timestamp + rval = Oneliner::SuperString.new + rval.extend(MonitorMixin) + newest_timestamp = t + end + + if t == newest_timestamp + rval.decode!(data) + end end - - if t == newest_timestamp - rval.decode!(data) - end end + rescue DRb::DRbConnError => e + @sites.delete(dump_id) end end Modified: trunk/archipelago/script/console =================================================================== --- trunk/archipelago/script/console 2007-02-22 22:07:09 UTC (rev 218) +++ trunk/archipelago/script/console 2007-02-22 22:33:52 UTC (rev 219) @@ -1,3 +1,3 @@ #!/usr/bin/env ruby -exec 'irb', '-r', File.join(File.dirname(__FILE__), 'pirate.rb'), "-I", 'lib' +exec 'irb', '-r', File.join(File.dirname(__FILE__), 'pirate.rb'), '-r', File.join(File.dirname(__FILE__), 'officer.rb'), "-I", 'lib' Added: trunk/archipelago/script/officer.rb =================================================================== --- trunk/archipelago/script/officer.rb (rev 0) +++ trunk/archipelago/script/officer.rb 2007-02-22 22:33:52 UTC (rev 219) @@ -0,0 +1,10 @@ +#!/usr/bin/env ruby + +require 'archipelago/sanitation' + +begin + DRb.uri +rescue DRb::DRbServerNotFound + DRb.start_service +end +@o = 
Archipelago::Sanitation::Officer.new Modified: trunk/archipelago/script/pirate.rb =================================================================== --- trunk/archipelago/script/pirate.rb 2007-02-22 22:07:09 UTC (rev 218) +++ trunk/archipelago/script/pirate.rb 2007-02-22 22:33:52 UTC (rev 219) @@ -2,6 +2,10 @@ require 'archipelago/pirate' -DRb.start_service +begin + DRb.uri +rescue DRb::DRbServerNotFound + DRb.start_service +end @p = Archipelago::Pirate::Captain.new @p.evaluate!(File.join(File.dirname(__FILE__), 'overloads.rb')) Modified: trunk/archipelago/script/services.rb =================================================================== --- trunk/archipelago/script/services.rb 2007-02-22 22:07:09 UTC (rev 218) +++ trunk/archipelago/script/services.rb 2007-02-22 22:33:52 UTC (rev 219) @@ -9,7 +9,7 @@ require 'archipelago/treasure' require 'archipelago/tranny' -require 'archipelago/cove' +require 'archipelago/dump' if ARGV.size > 1 DRb.start_service(ARGV[1]) @@ -22,9 +22,13 @@ t = Archipelago::Tranny::Manager.new(:persistence_provider => Archipelago::Hashish::BerkeleyHashishProvider.new(Pathname.new(File.join(ARGV[0], "tranny")))) t.publish! +puts "published #{t.class} with id #{t.service_id}" c = Archipelago::Treasure::Chest.new(:persistence_provider => Archipelago::Hashish::BerkeleyHashishProvider.new(Pathname.new(File.join(ARGV[0], "chest")))) c.publish! -tank = Archipelago::Cove::Tanker.new(:persistence_provider => Archipelago::Hashish::BerkeleyHashishProvider.new(Pathname.new(File.join(ARGV[0], "tanker")))) -tank.publish! +puts "published #{c.class} with id #{c.service_id}" +d = Archipelago::Dump::Site.new(:debug_callable => Proc.new do |msg| puts msg end, + :persistence_provider => Archipelago::Hashish::BerkeleyHashishProvider.new(Pathname.new(File.join(ARGV[0], "dump")))) +d.publish! 
+puts "published #{d.class} with id #{d.service_id}" DRb.thread.join Modified: trunk/archipelago/tests/disco_test.rb =================================================================== --- trunk/archipelago/tests/disco_test.rb 2007-02-22 22:07:09 UTC (rev 218) +++ trunk/archipelago/tests/disco_test.rb 2007-02-22 22:33:52 UTC (rev 219) @@ -82,8 +82,8 @@ end assert(empty) - - @d1.publish(Archipelago::Disco::Record.new(:service_id => 1, + + @d1.publish(Archipelago::Disco::Record.new(:service_id => 100, :validator => Archipelago::Disco::MockValidator.new, :epa => "blar2")) Modified: trunk/archipelago/tests/sanitation_test.rb =================================================================== --- trunk/archipelago/tests/sanitation_test.rb 2007-02-22 22:07:09 UTC (rev 218) +++ trunk/archipelago/tests/sanitation_test.rb 2007-02-22 22:33:52 UTC (rev 219) @@ -44,30 +44,31 @@ :persistence_provider => Archipelago::Hashish::BerkeleyHashishProvider.new(Pathname.new(__FILE__).parent.join("master_test_#{n}.db"))) dumps[n].publish! end - assert_within(10) do + assert_within(30) do cleaner2.update_services! - s = cleaner2.instance_eval do @sites end - dumps.all? do |dump| - s.include?(dump.service_id) - end + dumps.collect do |d| d.service_id end.sort == cleaner2.sites.keys.sort end 10.times do |n| if n < 9 assert(dumps[n].instance_eval do right_before?(dumps[n+1].service_id) end, - "#{dumps[n].service_id} is supposed to be right_before? #{dumps[n+1].service_id}, but isnt") + "#{dumps[n].service_id} is supposed to be right_before? #{dumps[n+1].service_id}, but isnt. services are #{cleaner2.sites.keys.inspect}") assert(dumps[n+1].instance_eval do right_after?(dumps[n].service_id) end, - "#{dumps[n+1].service_id} is supposed to be right_after? #{dumps[n].service_id}, but isnt") + "#{dumps[n+1].service_id} is supposed to be right_after? #{dumps[n].service_id}, but isnt. 
services are #{cleaner2.sites.keys.inspect}") else assert(dumps[n].instance_eval do right_before?(dumps[0].service_id) end, - "#{dumps[n].service_id} is supposed to be right_before? #{dumps[0].service_id}, but isnt") + "#{dumps[n].service_id} is supposed to be right_before? #{dumps[0].service_id}, but isnt. services are #{cleaner2.sites.keys.inspect}") assert(dumps[0].instance_eval do right_after?(dumps[n].service_id) end, - "#{dumps[0].service_id} is supposed to be right_after? #{dumps[n].service_id}, but isnt") + "#{dumps[0].service_id} is supposed to be right_after? #{dumps[n].service_id}, but isnt. services are #{cleaner2.sites.keys.inspect}") end end s1 = Oneliner::SuperString.new("brappa") s2 = Oneliner::SuperString.new("brappa2") - dumps[0].insert!("a", s1.encode(20), "aaaa") - dumps[1].insert!("a", s2.encode(200), "aaab") + assert_equal(0, dumps[0].db.size) + assert_equal(0, dumps[0].db.size) + dumps[0].insert!("a", [s1.encode(20)], "aaaa") + dumps[1].insert!("a", [s2.encode(200)], "aaab") + assert(cleaner2.responsible_sites("a").include?(dumps[0].service_id)) + assert(cleaner2.responsible_sites("a").include?(dumps[1].service_id)) assert_equal("brappa2", cleaner2["a"]) assert_equal([dumps[9].service_id, dumps[0].service_id, dumps[1].service_id].sort, cleaner2.instance_eval do @@ -82,7 +83,8 @@ assert_equal(dumps[9].service_id, cleaner2.second_master_to(dumps[0].service_id)) ensure - dumps.each do |dump| + dumps.extend(Archipelago::Current::ThreadedCollection) + dumps.t_each do |dump| dump.stop! dump.instance_eval do @persistence_provider.unlink! end end
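Note on the @officer change to dump.rb in rev 219: the old form, defined?(...) || Officer.new, appears to misbehave because defined? evaluates to a description String (or nil), not to the constant itself. The String is truthy, so the || hands that String back instead of the officer. The sketch below reproduces the effect in isolation. The Demo module and its contents are hypothetical stand-ins, not the real Archipelago::Sanitation classes.

```ruby
# Hypothetical stand-ins for Archipelago::Sanitation::Officer and CLEANER.
module Demo
  class Officer; end
  CLEANER = Officer.new
end

# Pre-219 shape: the result of defined? is a String when the constant
# exists, so `broken` ends up holding a String, not an Officer.
broken = (defined?(Demo::CLEANER) || Demo::Officer.new)

# Post-219 shape: defined? is used only as the condition of a ternary,
# and the constant itself is returned.
fixed = (defined?(Demo::CLEANER) ? Demo::CLEANER : Demo::Officer.new)

puts "broken is a #{broken.class}"
puts "fixed is a #{fixed.class}"
```

With the ternary form the fallback Officer.new is only constructed when the constant is missing, which matches what the rev 219 line does.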