Opened 6 years ago

Closed 5 years ago

#4913 closed enhancement (fixed)

Add stem.util.conf.Config.save()

Reported by: gsathya Owned by: atagar
Priority: Medium Milestone:
Component: Core Tor/Stem Version:
Severity: Keywords:
Cc: gsathya.ceg@… Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

Added a new save method that saves the current config contents into the configuration file by overwriting the old one.

  def save(self):
    self._contents_lock.acquire()

    with open(path, 'w') as f:
      for entry in self.keys():
        f.write('%s %s\n' % (entry, self.get(entry)))

    self._contents_lock.release()

If we want to retain the comments --

  def save(self):
    self._contents_lock.acquire()

    with open(path, 'w') as f:
      for line in self._raw_contents:
        key, value = ""
        comment_start = line.find("#")
        if comment_start != -1:
          comment = line[comment_start:]
          line = line[:comment_start]
        line = line.strip()
      
        if line:
          try:
            key, value = line.split(" ", 1)
          except ValueError:
            key, value = line, ""
    
        f.write("%s %s %s\n" % (key, self.get(key, value), comment)
      
    self._contents_lock.release()

Since we strip the lines, we don't really know what the formatting is like. This messes up the white spaces a bit.

Do we have an alternative? Or is one these fine?

Child Tickets

Change History (20)

comment:1 Changed 6 years ago by gsathya

Cc: gsathya.ceg@… added

comment:2 Changed 6 years ago by atagar

Status: newneeds_revision

with open(path, 'w') as f:

Ah nice, I've never run across the 'with' keyword before. I'm a little dubious of its general usefulness since it isn't clear what has the enter and exit methods (this is one case where a java style interface tag would be nice) but oh well. Files at least are the textbook example for using it. :)

Minor note in case you didn't know it: 'with' wasn't introduced until python 2.5. That's fine though since stem aims for 2.5+ compatibility (otherwise some of the ternary assignment statements I've been doing will break too).

Minor thing but please use another variable name, such as 'output_file'. Single letter variables are evil - they make the code ungrepable. For an amusing bit on this see 'Single Letter Variable Names' under...
http://thc.org/root/phun/unmaintain.html

I'm fine with using them for single line statements like...

foo = [str(i) for i in range(5)] # becomes the list ["0", "1", "2", "3", "4"]

or the classic 'i', 'j', 'k' for loop iteration ints, but for other blocks or anything else longer than a single line please use something longer. :)

for line in self._raw_contents:

Interesting, I hadn't thought of using _raw_contents for this. This would generally do the trick but has a few issues...

  • if we've used the "set" method to add new config entries then those will be missed
  • if we later allow config values to be unset then this will have issues
  • as you mentioned commenting whitespace is lost, which is kinda common for these key/value configs

While it would be nice if we had a magical function that merged the current config state with the user created config we loaded, there are too many land mines to dealing with the arbitrary user input. Lets go with the first implementation, though with the keys sorted.

Thanks! -Damian

comment:3 in reply to:  2 Changed 6 years ago by gsathya

Status: needs_revisionneeds_review

Replying to atagar:

Minor thing but please use another variable name, such as 'output_file'.

Changed.

  • if we've used the "set" method to add new config entries then those will be missed

We can fix(for some definition of fix) this by adding the unused_keys() at the end. But if the user has already requested some of the keys then this won't work.

Oh, this made me realize - I'm using

      for entry in sorted(self.iterkeys()):
        output_file.write('%s %s\n' % (entry, self.get(entry)))

which renders the unused_keys() irrelevant(all the keys would've been added to requested_keys after calling save() since we iterate over all the items in the contents) -- This is a stupid side effect. I can go around this by directly parsing self._contents, but is this necessary?

Lets go with the first implementation, though with the keys sorted.

Done

https://gitweb.torproject.org/user/gsathya/stem.git/shortlog/refs/heads/Config.save

comment:4 Changed 6 years ago by gsathya

Ok I think we need a different approach. Config.get() returns only the last updated value if I pass Multiple=False.
So in case we have a config file like -

      startup.run export PATH=$PATH:~/bin
      startup.run alias l=ls

We'd only be writing startup.run alias l=ls into the config file and startup.run export PATH=$PATH:~/bin would get pruned.

On the other hand, if we pass Multiple = True, we'd get a list of values which we dont need like login.password -> ["foo", "bar", "fubar"] . In this case we only need login.password -> ["foo"]

We can fix this by storing only multiple values and overwriting the old ones.

comment:5 Changed 6 years ago by atagar

Status: needs_reviewneeds_revision

I've pushed a couple of your changes (6c5f020 and 451e13e) with some very minor tweaks. Would you mind adding an integ test for your save function? I'd be happy to help if you're unsure what to add.

+ return self._contents.keys()

I'm gonna hazard the guess that this is testing code since the save() function is then a no-op. ;)

def iterkeys(self):

I'm not sure of the point of this function... why did you add it? When you call dict.keys() it's an iterable object... I suppose getting an iterator directly is slightly more efficient but probably not worth adding a new function.

On the other hand, if we pass Multiple = True, we'd get a list of values which we dont need like...

I'm not following, mind clarifying why you'd only want to save 'login.password -> foo?'?

comment:6 in reply to:  5 ; Changed 6 years ago by gsathya

Replying to atagar:

I've pushed a couple of your changes (6c5f020 and 451e13e) with some very minor tweaks.

Thanks!

Would you mind adding an integ test for your save function? I'd be happy to help if you're unsure what to add.

Ok I'll try.

+ return self._contents.keys()

I'm gonna hazard the guess that this is testing code since the save() function is then a no-op. ;)

I'm not sure I follow. Where is this from?

def iterkeys(self):

I'm not sure of the point of this function... why did you add it? When you call dict.keys() it's an iterable object... I suppose getting an iterator directly is slightly more efficient but probably not worth adding a new function.

Yeah, I wanted to do --

for entry in sorted(self.iterkeys()): 

instead of

config_keys = self.keys()
config_keys.sort() 

and iterating over

config_keys 

For some reason I thought you'd have to pass an iterable to sorted() and not just a list. Sorry.

On the other hand, if we pass Multiple = True, we'd get a list of values which we dont need like...

I'm not following, mind clarifying why you'd only want to save 'login.password -> foo?'?

So we have a config file that has

      login.password foo 

and we do the following

      user_config = stem.util.conf.get_config("login")
      user_config.load("/home/foo/myConfig")

      user_config.get("login.password")
      #"foo"
      
      user_config.set("login.password","bar")
      #login.password -> ["foo", "bar"]

      user_config.get("login.password")
      #"bar"

      user_config.save()
      #login.password bar

Now the config file has

     login.password bar

The above behavior is perfect. We only need to save "login.password bar", thus overwriting "login.password foo".

Now consider a config file with

      startup.run export PATH=$PATH:~/bin
      startup.run alias l=ls

and we do the following --

      user_config = stem.util.conf.get_config("startup")
      user_config.load("/home/foo/myConfig")

      user_config.get("startup.run")
      #"alias l=ls"

      user_config.save()
      #startup.run alias l=ls

Now, the config file only contains -

startup.run alias l=ls 

And

startup.run export PATH=$PATH:~/bin 

gets pruned.

We have deleted information from the config file. We shouldn't be doing this.

comment:7 in reply to:  6 Changed 6 years ago by gsathya

ACK! I just saw that you passed Multiple=True to Config.get_value(), in your latest commit. Now with the above example, when we call Config.save() we'd be storing

login.password foo bar

Right?

Now, if we were to read that file again

      user_config = stem.util.conf.get_config("relogin")
      user_config.load("/home/foo/myConfig")

      user_config.get("login.password")
      #The password becomes "foobar" which is wrong! 
      #This is because we strip out all the whitespaces 
      #after the first whitespace and store them as the 
      #value. 
      #key, value = line.split(" ", 1)

comment:8 Changed 6 years ago by gsathya

Is there a way to delete the comments? I just facepalm-ed after reading the previous comments. Please Ignore. I think we can close this ticket and save me from saying anything else?

comment:9 Changed 6 years ago by atagar

I'm not sure I follow. Where is this from?

First line after the pydocs in...
https://gitweb.torproject.org/user/gsathya/stem.git/commitdiff/4ba49e08ba1d4b07fbb3af87652cd11b3d9fdb9d

Yeah, I wanted to do --
for entry in sorted(self.iterkeys()):

Oh, good catch - we should use sorted() instead of sort in this case...

So we have a config file that has...

Ahh, I see. This is a problem with the set() method, not with save. In the example that you give the user is expecting the set() call to overwrite the configuration value rather than add a new one.

On reflection, what if we changed set() to...

def set(self, key, value, overwrite = True):
  """
  Sets the given key/value configuration mapping, behaving the same as if we'd
  loaded this from a configuration file.
  
  Arguments:
  key (str)           - key for the configuration mapping
  value (str or list) - value we're setting the mapping to
  overwrite (bool)    - replaces our current configuration mapping if True,
                        otherwise it's appended to the configuration
  """

This would allow the more intuitive usage that you're expecting.

ACK! I just saw that you passed Multiple=True to Config.get_value(), in your latest commit. Now with the above example, when we call Config.save() we'd be storing

login.password foo bar

Right?

Not quite, we'd store...

login.password foo
login.password bar

This seems like the right behavior if we *mean* for password to have multiple values. Again, if we wanted it to have a single value then that's more a problem with set().

Please Ignore. I think we can close this ticket and save me from saying anything else?

Don't worry about it, we're all human. :)

comment:10 Changed 6 years ago by atagar

with open(self._path, 'w') as f:

Oops, good catch. We still need to rename the 'f' variables too...

+# moving along..

Cute, though do we really want to push that? ;)

#Overwrite settings.

Very minor style things but...

  • please include a space between the hash and comment
  • if your comment is a single sentence then make it all lowercase without punctuation

examples...

# the following code will implode and destroy the universe

... or...

# This is a more sizable explanation of said universal doom. More
# specifically, the following code is gonna divide by zero ripping a tear in
# the space-time continuum... we think. Total galactic destruction has not yet
# been integration tested.

Each unit or integration test should be for a single use case of a single method to make testing failures easier to isolate. Test method names are of the form "test_<method>_<use case>". In this case you'd probably want the following new test methods...

# this test would...
# 1. make a config with the example config
# 2. save it to our integ data directory
# 3. clear the config
# 4. load the saved config
# 5. assert that our configuration has the example contents
# 6. remove the config we saved

test_save_example()

# similar to the above, but with an empty configuration file

test_save_empty()

# again similar, but with duplicate keys

test_save_multiple()

Just a suggestion - if you can think of better tests then go for it!

comment:11 in reply to:  10 Changed 6 years ago by gsathya

Replying to atagar:

I'm not sure I follow. Where is this from?

First line after the pydocs in...
https://gitweb.torproject.org/user/gsathya/stem.git/commitdiff/4ba49e08ba1d4b07fbb3af87652cd11b3d9fdb9d

Oops, Sorry!

Yeah, I wanted to do --
for entry in sorted(self.iterkeys()):

Oh, good catch - we should use sorted() instead of sort in this case...

https://gitweb.torproject.org/user/gsathya/stem.git/commit/56e76d1f206639d15d6e2de7e17732753debc112

def set(self, key, value, overwrite = True):

"""
Sets the given key/value configuration mapping, behaving the same as if we'd
loaded this from a configuration file.


Arguments:
key (str) - key for the configuration mapping
value (str or list) - value we're setting the mapping to
overwrite (bool) - replaces our current configuration mapping if True,

otherwise it's appended to the configuration

"""

This would allow the more intuitive usage that you're expecting.

+1 for this. I'll have a go at this.

ACK! I just saw that you passed Multiple=True to Config.get_value(), in your latest commit. Now with the above example, when we call Config.save() we'd be storing

login.password foo bar

Right?

Not quite, we'd store...

login.password foo
login.password bar

Yeah I realised this a little too late, hence the facepalm-ing.

with open(self._path, 'w') as f:

Oops, good catch. We still need to rename the 'f' variables too...

https://gitweb.torproject.org/user/gsathya/stem.git/commit/0479e058e644b89f7a6742e2672e3e0403f4a4ff

+# moving along..

Cute, though do we really want to push that? ;)

Heh :P

#Overwrite settings.

Very minor style things but...

  • please include a space between the hash and comment
  • if your comment is a single sentence then make it all lowercase without punctuation

examples...

# the following code will implode and destroy the universe

... or...

# This is a more sizable explanation of said universal doom. More
# specifically, the following code is gonna divide by zero ripping a tear in
# the space-time continuum... we think. Total galactic destruction has not yet
# been integration tested.

Aha, got it. Thanks for explaining :)

Each unit or integration test should be for a single use case of a single method to make testing failures easier to isolate. Test method names are of the form "test_<method>_<use case>". In this case you'd probably want the following new test methods...

# this test would...
# 1. make a config with the example config
# 2. save it to our integ data directory
# 3. clear the config
# 4. load the saved config
# 5. assert that our configuration has the example contents
# 6. remove the config we saved

test_save_example()

Pushed.

# similar to the above, but with an empty configuration file

test_save_empty()

Pushed.

Will try to push more tests, though I won't be committing much this week, since my exams start tomorrow. Also, I haven't added the pydoc strings to the pushed tests. Will do that soon.

https://gitweb.torproject.org/user/gsathya/stem.git/shortlog/refs/heads/integ_test_conf.save_v2

comment:12 Changed 6 years ago by atagar

Abiding by the coding guidelines of our benevolent dictator

Heh

commit 6185b825701025ed7ca371bedb3f0d2b1b7e9697

  • def test_example(self):

+ def test_save_example(self):

Please leave the current test_example() test alone and instead just add a new one. That test is to confirm that the example in the pydoc actually works.

That said, feel free to refactor if there's commonalities and it stays faithful to the what's in the doc for the Config class.

+ def makeConf(self, path, content):
+ self.tearDown()
+ with open(path, 'w') as output_file:
+ output_file.write(content)

Hmm, few nit picks...

  • Method and function names use the underscore convention (in this case "make_conf"). I did camel case like that in arm but switched for stem because underscores are both more readable and the 'official' coding contention in python.

From "http://www.python.org/dev/peps/pep-0008/"...
"Function names should be lowercase, with words separated by underscores as necessary to improve readability."

Also, python's convention is that private functions start with an underscore so this should actually be "_make_conf".

  • This really shouldn't be calling tearDown() directly - both setUp() and tearDown() are methods of the unit test class that are executed before and after the tests respectively. And yes, I realize that they don't follow the underscore convention. ;)

commit e154c99311b1a3a05bd6b633d6d4b3b7823e9487
+ user_config.save()
+ user_config.clear()
+ user_config.load(CONF_PATH)

Great if it didn't disrupt the test_example(). The way I usually approach writing tests is to first write them as stand-alone functions (even if this duplicates code), then refactor out repetitive bits afterward. In testing code repetitions are a bit more acceptable as long as the tests are readable.

+ def test_save_empty(self):

Hmm, this looks more like a test_save_example test. What I meant by 'save empty' was to call save with an empty configuration, like...

user_config = stem.util.conf.get_config("integ-save_empty")
user_config.save(test_path)

# assert that file exists and that it's empty

user_config.load(test_path)

# assert that user_config is still empty

Speaking of which, the save() function should probably accept an optional path (and if not provided use self._path).

+ self.tearDown()

You don't need to call tearDown() explicitly - it's automatically called after the test.

commit 56e76d1f206639d15d6e2de7e17732753debc112
+ for entry_key in sorted(self.keys()):

Perfect

Cheers! -Damian

comment:14 Changed 6 years ago by gsathya

Status: needs_reviewneeds_revision

Minor update:
Previously the single line config entries were converted to a multi line structure and then saved. Now, the structure of the config file is preserved.

https://gitweb.torproject.org/user/gsathya/stem.git/commitdiff/dd29415f27eb509ca496bc2159258a3a6b5d7774?hp=3f51e6920c1dc6dc28d7f70115cd30ed8c020b29

comment:15 Changed 6 years ago by atagar

Resolution: fixed
Status: needs_revisionclosed

Thanks Sathyanarayanan! I've merged your changes with a few edits...
https://gitweb.torproject.org/stem.git/commitdiff/e3b78e898dcfd4d33356c7e791fb92e6c7128484

The save() method looks done to me so resolving. Feel free to reopen if there's anything you'd like to change or discuss further.

Cheers! -Damian

comment:16 Changed 6 years ago by gsathya

Yay! Thanks!

comment:17 Changed 5 years ago by atagar

Resolution: fixed
Status: closedreopened

Hi Sathyanarayanan. After replying to Ravi's proposal on tor-dev@ Sebastian pointed out that stem's license will pose a problem for code sharing with other tor projects. I've added a note explaining it to...
https://trac.torproject.org/projects/tor/wiki/doc/stem#CopyrightforPatches

I'm checking with Wendy to see if this is the right interpretation, but working from that assumption a couple questions...

  1. Are you alright with this patch being a joint work with me (ie, sharing the copyright)?
  2. Are you alright with your future stem submissions being joint works so I don't need to keep asking?

Thanks! -Damian

comment:18 in reply to:  17 Changed 5 years ago by gsathya

Resolution: fixed
Status: reopenedclosed
  1. Are you alright with this patch being a joint work with me (ie, sharing the copyright)?

Yes

  1. Are you alright with your future stem submissions being joint works so I don't need to keep asking?

Yes

comment:19 Changed 5 years ago by atagar

Resolution: fixed
Status: closedreopened

Hi Sathyanarayanan. Sorry about reopening again. I checked with Wendy (a law professor and director with the tor project) and she suggested asking for patches to be within the public domain instead. Is that alright with you?

comment:20 in reply to:  19 Changed 5 years ago by gsathya

Resolution: fixed
Status: reopenedclosed

Replying to atagar:

Hi Sathyanarayanan. Sorry about reopening again. I checked with Wendy (a law professor and director with the tor project) and she suggested asking for patches to be within the public domain instead. Is that alright with you?

Yes

Note: See TracTickets for help on using tickets.