Inkscape 0.47 Development Plans

Hi all,
With Inkscape 0.46 wrapping up, it's time to look forward to our next release, 0.47, and our plans for its development.
When we started Inkscape, we began with a codebase with lots of potential but with some architectural limitations that we've never quite resolved. Inkscape has grown rapidly, especially thanks to Google's Summer of Code program. Unfortunately, while we've gained a lot of new features, this growth hasn't addressed the underlying issues - and in some cases has exposed new problems.
Inkscape's also been extremely successful at gaining a lot of contributors, yet this comes with a price: stylistic differences, accidental code duplication, unfinished code, obsolete code, etc.
What will the codebase cleanup work entail? The work will range from straightforward "grunt" work like making some simple code changes to all files in the codebase, to meatier work like abstracting widely used code into a more concise and powerful algorithm, to advanced work such as extracting distinct code into independent code packages.
To boil this down into five high level objectives:
0. Complete some of the big architectural refactoring efforts
1. Reduce source code line count
2. Break useful code out into stand-alone libraries
3. Increase code stylistic consistency
4. Make the codebase more convenient to code in
Now, architectural reworkings often risk incurring massive breakages, since fundamental pieces of the code are being changed. In order to minimize this risk, I'd like to suggest the following principles:
* Always keep the tree buildable
* Do major refactorings in small steps
* Hold code review parties with 2-3 others to brainstorm
* Drop copied-in codebases in favor of external dependencies
* Make sure every function you touch has some doxygen comments
Further, this kind of work can go on indefinitely without a clear stopping point, so I think for this release we should use a schedule with a date-based stopping point. This will help everyone know how they should time their work.
Mar 10  Release 0.46. 0.47 Refactoring / Cleanup work begins.
Apr
May
Jun
Jul 1   Completion of refactoring. Focus on Bug Fixing begins.
        Open 0.48 development tree early, for GSoC work.
Aug     Put out 0.47-pre releases.
Sep     Release 0.47.
For reference, here are some key GSoC dates:
May 26  GSoC work begins.
Jul 14  GSoC midterm. First delivery of GSoC code.
Aug 18  GSoC work ends.
This schedule permits us to focus exclusively on refactoring for several months, with a due date of July 1st to complete it. It uses a very early branch point, where we'll split into a stable branch for doing bug fix and release work, and a development branch for the GSoC students to use and for folks to continue right on with refactoring projects if they wish.
Bryce

Forgot to mention, I've also updated the Roadmap, and listed some more specific tasks:
http://wiki.inkscape.org/wiki/index.php/Roadmap

Bryce Harrington wrote:
Forgot to mention, I've also updated the Roadmap, and listed some more specific tasks:
Regarding the C++-ification of SPObjects, what is the best way to proceed? As I understand it, it is not really possible (or feasible) to mix GObjects and true C++ objects, so I suppose that we would need to first switch "under the hood" and then get rid of the GObject framework in one go, right?
So for each SPObject a corresponding C++ class needs to be created which encapsulates its functionality, and all the SPObject "methods" would be changed to only call the corresponding ones of the new class (as already seems to be the case in sp-item-group.cpp). Is that correct or am I getting something completely wrong here? Should we keep a list on the wiki of the SPObjects for which this conversion has already been done, so that we can keep track of the progress?
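In code, I imagine something like this minimal sketch (hypothetical names, only loosely inspired by sp-item-group.cpp; the real classes would of course be richer):

    // Hypothetical sketch of the delegation pattern described above.
    class CThing;

    struct SPThing {        // the existing GObject-based struct, simplified
        CThing *cthing;     // pointer to the new C++ implementation object
    };

    // The new C++ class encapsulates the actual behavior...
    class CThing {
    public:
        CThing(SPThing *object) : _object(object) {}
        void onUpdate() { /* the real logic lives here */ }
    private:
        SPThing *_object;
    };

    // ...while the old GObject-style "method" merely forwards to it:
    static void sp_thing_update(SPThing *thing) {
        thing->cthing->onUpdate();
    }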
Max

On Fri, 2008-03-14 at 13:57 +0100, Maximilian Albert wrote:
So for each SPObject a corresponding C++ class needs to be created which encapsulates its functionality, and all the SPObject "methods" would be changed to only call the corresponding ones of the new class (as already seems to be the case in sp-item-group.cpp). Is that correct or am I getting something completely wrong here? Should we keep a list on the wiki of the SPObjects for which this conversion has already been done, so that we can keep track of the progress?
I'm not happy with the approach that was taken with CGroup -- I think it's an unmaintainable mess, and we'd be unlikely to ever finish the migration if we went that route.
My preferred approach would be to replace GObject with NRObject in the SPObject hierarchy, removing the inter-SPObject refcounting (which would create unbreakable GC cycles) and explicit member initialization (no longer needed, as NRObject calls the C++ constructors/destructors). At that point we can start using all C++ features directly in the existing SPObject classes.
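To make the payoff concrete, here's a rough sketch (made-up member names, and a stand-in base class) of the shape an SPObject subclass could take once real constructors run:

    #include <cstddef>

    class SPObject { /* stand-in for the real base class */ };

    enum ThingMode { THING_MODE_DEFAULT, THING_MODE_OTHER };

    class SPThing : public SPObject {
    public:
        // NRObject would invoke this constructor, so the GObject-style
        // sp_thing_init() member-initialization function goes away...
        SPThing() : mode(THING_MODE_DEFAULT), cache(NULL) {}
        // ...and cleanup moves out of dispose/finalize hooks into an
        // ordinary destructor.
        virtual ~SPThing() {}
    private:
        ThingMode mode;
        void *cache;
    };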
It's a big hurdle to get over initially, but it's one shot, and once we do it we can start improving the design and adopting native C++ features organically and incrementally.
-mental

On Sat, 2008-03-15 at 16:58 -0400, MenTaLguY wrote:
My preferred approach would be to replace GObject with NRObject in the SPObject hierarchy, removing the inter-SPObject refcounting (which would create unbreakable GC cycles) and explicit member initialization (no longer needed, as NRObject calls the C++ constructors/destructors). At that point we can start using all C++ features directly in the existing SPObject classes.
Actually, I think it may be best if I move the GC stuff into e.g. a distinct NRGCObject subclass, so that we don't have to do the GC migration things at the same time as everything else.
-mental

On Friday, March 14, 2008, 7:34:53 AM, Bryce wrote:
BH> Forgot to mention, I've also updated the Roadmap, and listed some more
BH> specific tasks:
BH> http://wiki.inkscape.org/wiki/index.php/Roadmap
In milestones 15 and 16, which seem to be focussed on mobile, multipage is listed.
Does that date from when the SVG WG was trying to use the same page construct for actual printed pages and also for animation scenes? (We no longer do this). SVG Tiny 1.2 does not have multipage.
However, it does have a bunch of other features that would be interesting to author - video, audio, discard, the animation element, and so forth.
Is anyone else interested in working on a list of SVG Tiny 1.2 features that go beyond SVG 1.1, from an authoring perspective, and discussing what authoring challenges each feature might present?
Same question, but this time for authoring SVG Print documents - in case the answer to my first question was "no, it means multiple pages like in SVG Print".

On Fri, Mar 14, 2008 at 02:40:45PM +0100, Chris Lilley wrote:
On Friday, March 14, 2008, 7:34:53 AM, Bryce wrote:
BH> Forgot to mention, I've also updated the Roadmap, and listed some more
BH> specific tasks:
BH> http://wiki.inkscape.org/wiki/index.php/Roadmap
In milestones 15 and 16, which seem to be focussed on mobile, multipage is listed.
Does that date from when the SVG WG was trying to use the same page construct for actual printed pages and also for animation scenes? (We no longer do this). SVG Tiny 1.2 does not have multipage.
No, it's just an oft-requested feature, and coincidental that it got listed there.
Note that while our milestones "declare" a focus, in reality much of Inkscape development is parallel, and strongly driven by developer interest. So we may say that feature Foo is scheduled for 0.49, but if no one is interested in working on it, it might get pushed back to ever later releases (which is what has happened with both animation and multipage). On the other hand, features which are of interest often get done earlier.
So, it's best to think of the Roadmap not as a work plan, but more as a weather forecast.
Bryce

On Mar 13, 2008, at 10:54 PM, Bryce Harrington wrote:
Now, architectural reworkings often risk incurring massive breakages, since fundamental pieces of the code are being changed. In order to minimize this risk, I'd like to suggest the following principles:
- Always keep the tree buildable
- Do major refactorings in small steps
- Hold code review parties with 2-3 others to brainstorm
- Drop copied-in codebases in favor of external dependencies
- Make sure every function you touch has some doxygen comments
I think you missed one key point.
We need to get the unit tests passing and updated.
Although some failures are not too hard to fix, there are a few that are probably critical. I recall that at least one is due to one of our core string-enum-string round trips failing. For that one we might need to change a core assumption and approach.
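(To illustrate the kind of round trip I mean, with made-up names: converting a value to its string form and parsing that string back should be the identity.)

    // Hypothetical illustration of a string-enum-string round trip.
    #include <cassert>
    #include <cstring>

    enum FillRule { FILL_NONZERO, FILL_EVENODD };

    static const char *fill_rule_to_string(FillRule r) {
        return (r == FILL_EVENODD) ? "evenodd" : "nonzero";
    }

    static FillRule fill_rule_from_string(const char *s) {
        return (std::strcmp(s, "evenodd") == 0) ? FILL_EVENODD : FILL_NONZERO;
    }

    int main() {
        // The round trip must give back the original value for every member.
        assert(fill_rule_from_string(fill_rule_to_string(FILL_NONZERO)) == FILL_NONZERO);
        assert(fill_rule_from_string(fill_rule_to_string(FILL_EVENODD)) == FILL_EVENODD);
        return 0;
    }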
I think that getting them working and keeping them working is probably second only to "Always keep the tree buildable". If we can keep the unit tests working, and keep adding to them, then most of the other problems can be reduced.
One other way to describe this is...
Before we start climbing up the front of the building to start tearing down the facade, let's make sure we have a scaffolding up and bolted solidly together instead of just leaning out from the window ledges.
:-)

Jon A. Cruz wrote:
We need to get the unit tests passing and updated.
Is there an overview of the test framework available somewhere? How does it work? What is already there, what needs to be updated or newly implemented, and how can this be done? Sorry if these are dumb questions (I couldn't find anything on the wiki), but apparently I haven't yet come into contact with the testing stuff. I would be interested in having a look at it (if only to be able to keep it in mind during future coding).
Max

On Fri, Mar 14, 2008 at 01:23:29PM +0100, Maximilian Albert wrote:
Jon A. Cruz wrote:
We need to get the unit tests passing and updated.
Is there an overview of the test framework available somewhere? How does it work? What is already there, what needs to be updated or newly implemented, and how can this be done? Sorry if these are dumb questions (I couldn't find anything on the wiki), but apparently I haven't yet come into contact with the testing stuff. I would be interested in having a look at it (if only to be able to keep it in mind during future coding).
There isn't an overview on the wiki, but you can see the code itself in inkscape/cxxtest, which includes a simple user's guide in the README. You can also see examples in inkscape/src/round-test.h, extract-uri-test.h, etc.
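To give a flavor of what a suite looks like, here is a minimal made-up example (the real ones live in the files above):

    #include <cxxtest/TestSuite.h>

    class RoundingTest : public CxxTest::TestSuite {
    public:
        // cxxtest picks up any method whose name starts with "test".
        void testRoundsHalfAwayFromZero() {
            TS_ASSERT_EQUALS(round_to_int(2.5), 3);
            TS_ASSERT_EQUALS(round_to_int(-2.5), -3);
        }
    private:
        // Stand-in for whatever function is under test.
        static int round_to_int(double x) {
            return (x >= 0) ? int(x + 0.5) : -int(-x + 0.5);
        }
    };

Such a header gets run through cxxtestgen, which generates the runner that actually executes the suite.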
Note there is also a src/utest/, which is used by attributes-test.h and style-test.cpp. It seems to me that it's redundant to have two different unit test systems in the code. cxxtest seems to be more widely used; perhaps the tests currently using utest should be converted to cxxtest, and utest dropped.
It would also be good to create a Wiki page about the test framework (with a pointer to cxxtest's README for details) in case others go looking for it. Maximilian, would you mind creating this?
Bryce

On Thu, Mar 13, 2008 at 11:53:00PM -0700, Jon A. Cruz wrote:
We need to get the unit tests passing and updated.
Agreed; I've added this: http://wiki.inkscape.org/wiki/index.php/0.47_Refactoring_Plan
Bryce

On Thu, 2008-03-13 at 23:53 -0700, Jon A. Cruz wrote:
We need to get the unit tests passing and updated.
We also need to break up dependencies and stub/mock things as needed so that individual test suites don't take 45 minutes to build.
As long as new test runs take as long as they currently do, I don't think anyone's going to be seriously using the unit tests for anything.
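For example (made-up names): if the code under test needs only one small behavior from a heavyweight subsystem, let it depend on a narrow interface, and give the test a trivial stub instead of linking in the real thing:

    // A narrow interface that the code under test depends on...
    class UnitResolver {
    public:
        virtual ~UnitResolver() {}
        virtual double toPixels(double value, const char *unit) const = 0;
    };

    // ...and a trivial stub for tests, which builds in seconds because it
    // drags in none of the document or rendering code.
    class StubUnitResolver : public UnitResolver {
    public:
        virtual double toPixels(double value, const char * /*unit*/) const {
            return value;  // treat every unit as px for this test
        }
    };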
-mental

Quoting MenTaLguY <mental@...3...>:
On Thu, 2008-03-13 at 23:53 -0700, Jon A. Cruz wrote:
We need to get the unit tests passing and updated.
We also need to break up dependencies and stub/mock things as needed so that individual test suites don't take 45 minutes to build.
As long as new test runs take as long as they currently do, I don't think anyone's going to be seriously using the unit tests for anything.
Sounds like there is some debugging to do on the process. I don't see anything near that on my system.
If I go into the src directory, ensure my inkscape compile is up to date, then fire off a check, it goes very quickly:
real    0m42.900s
user    0m34.698s
sys     0m3.588s
So I'm seeing 43 seconds to build and execute the base unit tests, but you're seeing 45 minutes. Something is seriously different.
Perhaps we need to just collect this on a "Unit Test Pain" page so we can fix it for people seeing bad behavior.

On Sat, 2008-03-15 at 22:05 +0000, jon@...18... wrote:
Sounds like there is some debugging to do on the process. I don't see anything near that on my system.
If I go into the src directory, ensure my inkscape compile is up to date, then fire off a check, it goes very quickly:
real    0m42.900s
user    0m34.698s
sys     0m3.588s
So I'm seeing 43 seconds to build and execute the base unit tests, but you're seeing 45 minutes. Something is seriously different.
It's been a while since I've looked at the tests -- it sounds like this problem may already be fixed?
-mental

On 2008-March-14 , at 06:54 , Bryce Harrington wrote:
[...] Now, architectural reworkings often risk incurring massive breakages, since fundamental pieces of the code are being changed. In order to minimize this risk, I'd like to suggest the following principles:
- Always keep the tree buildable
- Do major refactorings in small steps
- Hold code review parties with 2-3 others to brainstorm
- Drop copied-in codebases in favor of external dependencies
- Make sure every function you touch has some doxygen comments
[...] This schedule permits us to focus exclusively on refactoring for several months, with a due date of July 1st to complete it. It uses a very early branch point, where we'll split into a stable branch for doing bug fix and release work, and a development branch for the GSoC students to use and for folks to continue right on with refactoring projects if they wish.
Since this supposes a lot of branching (for SoC, probably also for many large scale changes because the tree still needs to be usable during that time etc.) wouldn't it be a good time to change version control system to something that eases branching and merging? Plus, distributed version control would ease a model where changes are reviewed by several people before they are committed to the main tree (i.e. just pull the changes from other people's trees and allow commits to the main tree only by the reviewers). The discussions on the list about this are old and quite long; to summarize them in a few words:
- everybody agreed that there are solutions superior to svn from all points of view
- the contenders are git, mercurial and bazaar
- all of them are pretty equivalent in terms of basic functionality now
- compared to basic svn usage (update, status, diff, commit, log), all of them should be close enough that they don't demand much learning by developers used to only svn
- in all of them it should be possible to hook a check mechanism after each commit (which would probably prove very useful in the context of re-factoring, to check that changes pass 'make check' or another custom test suite)
Now for the specific benefits/problems associated with each one (I know git a little, not the others, so please fill in the blanks and correct my mistakes so that we can come up with a clear list to decide from (or not ;)):
-- GIT ----------------------------
Good
- very fast (but anything would be faster than svn anyway)
- allows stashing away local changes to fix a small issue; probably useful when, in the middle of a large code change, one notices a local bug which does not have anything to do with the change (not sure this is git-only)
- ability to follow chunks of code around, without relying on file names; probably useful from a refactoring point of view
- probably the fastest growing user base => many tools
Bad
- does not run too well on windows

-- HG -----------------------------
Good
Bad

-- BZR ----------------------------
Good
- already integrated in launchpad
Bad
JiHO --- http://jo.irisson.free.fr/

jiho wrote:
Since this supposes a lot of branching (for SoC, probably also for many large scale changes because the tree still needs to be usable during that time etc.) wouldn't it be a good time to change version control system to something that eases branching and merging?
Excellent that you bring this up, and thanks for the succinct summary of the discussion so far. I wholeheartedly agree that, especially for the purposes of refactoring, switching version control systems would prove extremely valuable.
Of the three alternatives you mentioned I also know only git so I can't make substantiated comments on which of them would be best for our purposes. But from what I've seen I guess I'd be happy with any of them (although of course I wouldn't be disappointed if it were git :)).
Max

I'm still a little uncertain about this. Git seems increasingly popular and very powerful; it just doesn't seem very obvious for new users. SVN is great at the moment, because there's a great GUI client for windows (TortoiseSVN), and quite a good one for linux (RapidSVN). Having easy source control clients seems like an easy way to get people involved. So I'm wondering: are there any good GUI clients for git, mercurial or bazaar?

jiho wrote:
Since this supposes a lot of branching (for SoC, probably also for many large scale changes because the tree still needs to be usable during that time etc.) wouldn't it be a good time to change version control system to something that eases branching and merging? Plus, distributed version control would ease a model where changes are reviewed by several people before they are committed to the main tree (i.e. just pull the changes from other people's trees and allow commits to the main tree only by the reviewers). The discussions on the list about this are old and quite long and to summarize them in a few words:
- everybody agreed that there are solutions superior to svn from all points of view
- the contenders are git, mercurial and bazaar
- all of them are pretty equivalent in terms of basic functionality now
- compared to basic svn usage (update, status, diff, commit, log), all of them should be close enough that they don't demand much learning by developers used to only svn
- in all of them it should be possible to hook a check mechanism after each commit (which would probably prove very useful in the context of re-factoring, to check that changes pass 'make check' or another custom test suite)
Now for the specific benefits/problems associated with each one (I know git a little, not the others, so please fill in the blanks and correct my mistakes so that we can come up with a clear list to decide from (or not ;)):
I think this would be a great time to encourage DVCS usage among the developers. But until a number of us have some experience from which to speak about the products, making a decision to use one would make me nervous.
As you may know a few people are already using DVCS systems to do their work on inkscape. Ted uses SVK (you can tell by the number of times that his commit messages say "SVK screwed up the last commit" :-) ). Mental uses git-svn (he described his use of git-svn for jruby at http://moonbase.rydia.net/mental/blog/programming/using-git-svn-for-jruby). I've heard that a similar workflow is possible with bzr (http://bazaar-vcs.org/BzrForeignBranches/Subversion). I have not heard anything of hg but I would expect something similar. We can try these tools now, without switching over the entire repo.
Having said that perhaps there is some worth in switching now.
-- GIT ----------------------------
Good
- very fast (but anything would be faster than svn anyway)
- allows stashing away local changes to fix a small issue; probably useful when, in the middle of a large code change, one notices a local bug which does not have anything to do with the change (not sure this is git-only)
- ability to follow chunks of code around, without relying on file names; probably useful from a refactoring point of view
- probably the fastest growing user base => many tools
Bad
- does not run too well on windows

-- HG -----------------------------
Good
Bad

-- BZR ----------------------------
Good
- already integrated in launchpad
- supports renames (could be considered a Bad by git users)
- supports bundling changesets
Bad
Aaron

Aaron Spike wrote:
-- GIT ----------------------------
Bad
- does not run too well on windows
I'm not sure this is fair to say anymore. I've used git to branch a few repos onto a winxp box without issue. I haven't used it for much work at all but the simple features seemed to work at first glance. I would encourage some of the other win32 devs to give it a whirl.
http://code.google.com/p/msysgit/
And maybe someone would be interested enough to put a little work into:
http://repo.or.cz/w/git-cheetah.git/
Aaron Spike

Aaron Spike wrote:
Aaron Spike wrote:
-- GIT ----------------------------
Bad
- does not run too well on windows
I'm not sure this is fair to say anymore. I've used git to branch a few repos onto a winxp box without issue. I haven't used it for much work at all but the simple features seemed to work at first glance. I would encourage some of the other win32 devs to give it a whirl.
Apparently there is some progress on git-svn on windows too!
http://groups.google.com/group/msysgit/browse_thread/thread/4fe38865bdc6a862...
Aaron

On 2008-March-14 , at 13:15 , Aaron Spike wrote:
Aaron Spike wrote:
-- GIT ----------------------------
Bad
- does not run too well on windows
I'm not sure this is fair to say anymore. I've used git to branch a few repos onto a winxp box without issue. I haven't used it for much work at all but the simple features seemed to work at first glance. I would encourage some of the other win32 devs to give it a whirl.
http://code.google.com/p/msysgit/
And maybe someone would be interested enough to put a little work into:
That's good news (I, too, "wouldn't mind" if git was chosen ;)). Is it easy to install and all or is it just "grab the source and compile it yourself" for now?
JiHO --- http://jo.irisson.free.fr/

On 2008-March-14 , at 13:34 , jiho wrote:
On 2008-March-14 , at 13:15 , Aaron Spike wrote:
Aaron Spike wrote:
-- GIT ----------------------------
Bad
- does not run too well on windows
I'm not sure this is fair to say anymore. I've used git to branch a few repos onto a winxp box without issue. I haven't used it for much work at all but the simple features seemed to work at first glance. I would encourage some of the other win32 devs to give it a whirl.
http://code.google.com/p/msysgit/
And maybe someone would be interested enough to put a little work into:
That's good news (I, too, "wouldn't mind" if git was chosen ;)). Is it easy to install and all or is it just "grab the source and compile it yourself" for now?
Don't bother. I should have clicked through a little bit. There's an installer.
JiHO --- http://jo.irisson.free.fr/

jiho wrote:
Since this supposes a lot of branching (for SoC, probably also for many large scale changes because the tree still needs to be usable during that time etc.) wouldn't it be a good time to change version control system to something that eases branching and merging?
I branched and merged a lot for GSoC 2007. Merging and branching *is* easy in SVN.
jiho wrote:
Plus, distributed version control would ease a model where changes are reviewed by several people before they are committed to the main tree (i.e. just pull the changes from other people's trees and allow commits to the main tree only by the reviewers).
I am opposed to such a working model. Reviewing would be a fulltime job, and not that much fun (count me out!). Who will have enough knowledge of allll of Inkscape anyway? One of the nice things about TortoiseSVN: when you do an SVN update, you can press the view log button, which then shows *only* the log of revisions since the last svn update. This way it is very easy to keep track of what changed recently. Since the log UI of TortoiseSVN is quite nice, you can see which files changed and click on them to see the diff. Makes for easy reviewing when one wants it (I often check files of which I know the code), but does not demand reviewing.
I am very much against doing offline 'commits', just from what I see happening in practice. Although it is nice to be able to 'save' work when there is no internet connection available, it removes an *essential* part from the workflow: svn update, and checking whether everything still works, possibly checking source code changes. I'm not sure whether everybody does this:
0. svn update
1. code stuff
2. svn update (!!!)
3. make
4. does it work? did merging go well?
5. svn commit
2-5 happen in rapid succession and need internet. This at least ensures that we don't get missing-';' build breakage etc., but it also prevents more complex problems. I think any offline-commit VCS method removes 2 to 5.
Regards, Johan

J.B.C.Engelen@...1578... wrote:
jiho wrote:
Since this supposes a lot of branching (for SoC, probably also for many large scale changes because the tree still needs to be usable during that time etc.) wouldn't it be a good time to change version control system to something that eases branching and merging?
I branched and merged a lot for GSoC 2007. Merging and branching *is* easy in SVN.
It is easy if you keep the proper workflow. But you still have to know something. If you haven't done any work with a DVCS system you should try one out at home with one of your pet projects for a week and see how "easy" branching and merging really can be. In DVCS the system knows which revisions are where so you don't have to keep track, you simply tell it to merge this way or that and it does. No mental effort.
jiho wrote:
Plus, distributed version control would ease a model where changes are reviewed by several people before they are committed to the main tree (i.e. just pull the changes from other people's trees and allow commits to the main tree only by the reviewers).
I am opposed to such a working model. Reviewing would be a fulltime job, and not that much fun (count me out!). Who will have enough knowledge of allll of Inkscape anyway?
I think I agree here. In the Inkscape project we don't follow such a model. We are very open with committer access and that is one of the reasons for the project's success.
One of the nice things about TortoiseSVN: when you do an SVN update, you can press the view log button, which then shows *only* the log of revisions since the last svn update. This way it is very easy to keep track of what changed recently. Since the log UI of TortoiseSVN is quite nice, you can see which files changed and click on them to see the diff. Makes for easy reviewing when one wants it (I often check files of which I know the code), but does not demand reviewing.
I think the workflow for a DVCS is a bit different. For example I keep one branch to track the state of the main repo and another branch for each task I'm currently working on (branches rather than separate checkouts). Now I can update the state of the tracking branch as often as I want (independently of my working branch, like you are used to in svn). And because the system knows which revisions are in which branches I can check the log at anytime and get just the new commits by comparing the commits on each branch. In git the command for this is called `git whatchanged`. Also one of the things I noticed is that any use of the log functionality in TortoiseSVN was unbearably slow because it requires network access. A DVCS stores all this information locally, so the logs become much more useful and accessible. I think you'll appreciate this.
I am very much against doing offline 'commits', just from what I see happening in practice. Although it is nice to be able to 'save' work when there is no internet connection available, it removes an *essential* part from the workflow: svn update, and checking whether everything still works, possibly checking source code changes. I'm not sure whether everybody does this:
0. svn update
1. code stuff
2. svn update (!!!)
3. make
4. does it work? did merging go well?
5. svn commit
2-5 happen in rapid succession and need internet. This at least ensures that we don't get missing-';' build breakage etc., but it also prevents more complex problems. I think any offline-commit VCS method removes 2 to 5.
I really don't understand why you think that a DVCS would cause people to practice any worse VCS hygiene than now. DVCS gives you freedom to commit in smaller logical blocks. As a person who worked with branches in SVN extensively you should understand the utility of this. You likely committed much more frequently on your branch than you would have committed to trunk. The diffs were smaller and easier to read. The commit messages were more granular and more meaningful. And when working on a branch svn update is equivalent to merge. You didn't merge for every commit. You had the freedom to work out an idea while committing bite-sized pieces and dealing with the integration problems when you chose to. But you still dealt with them. As things are now many people keep large sets of changes in their local tree and work, effectively, without the help of version control day to day. Bulia, for one, likes to drop large changesets on us. I think he will benefit tremendously from DVCS. Think of it as a unification of team VCS and your very own personal VCS. A work flow might look like this:
0. check out HEAD branch from central repo
1. branch "topic"
2. code a bit, test, commit.
3. code a bit, test, commit.
...
4. fetch changes from central repo into HEAD
5. merge HEAD into "topic"
6. fix, test, commit
7. code a bit, test, commit.
8. code a bit, test, commit.
**finished new functionality or refactoring**
9. fetch changes from central repo into HEAD
10. merge HEAD into "topic"
11. fix, test, commit
12. push topic changes to central repo

On 2008-March-17 , at 13:26 , Aaron Spike wrote:
jiho wrote:
Plus, distributed version control would ease a model where changes are reviewed by several people before they are committed to the main tree (i.e. just pull the changes from other people's trees and allow commits to the main tree only by the reviewers).
I am opposed to such a working model. Reviewing would be a fulltime job, and not that much fun (count me out!). Who will have enough knowledge of allll of Inkscape anyway?
I think I agree here. In the Inkscape project we don't follow such a model. We are very open with committer access and that is one of the reasons for the project's success.
Just to clarify about this: I know that this is not Inkscape's current workflow, and it probably should not become it in the future either (if I understood things well, Sodipodi used this kind of hierarchical workflow and it was not really appreciated by the folks now working on Inkscape ;)). I mentioned this here in the context of refactoring, where Bryce suggested: [...] I'd like to suggest the following principles: ... * Hold code review parties with 2-3 others to brainstorm ... I just wanted to mention that a DVCS could ease that part very much by allowing a few people to pull from others, and then having one of the reviewers push the change set to the central repo once it is done. Maybe my words were a bit strong in saying that committing to the central repo would be restricted to only a few people. The workflow was really what I wanted to point out.
JiHO --- http://jo.irisson.free.fr/

Since the thread is becoming long and has many forks (making it difficult to follow), I made a wiki page where I tried to summarize what was said here: http://wiki.inkscape.org/wiki/index.php/Change_VCS It would benefit from review by other people who have followed the thread. In particular, adding a proper speed comparison of all systems on the Inkscape tree would probably be valuable. Updating this page together with emails on the list would then make the current status of the discussion easier to follow. I also added a DVCS vs SVN section, which is to be expanded (I just did not feel courageous enough to dig inside the email archive and exhume the old threads on the subject). If you think it is still a concern, please add your opinion there.
JiHO --- http://jo.irisson.free.fr/

-----Original Message-----
From: Aaron Spike [mailto:aaron@...749...]
Sent: Monday, March 17, 2008 13:27
one of the things I noticed is that any use of the log functionality in TortoiseSVN was unbearably slow because it requires network access. A DVCS stores all this information locally, so the logs become much more useful and accessible. I think you'll appreciate this.
This does indeed sound nice. So, DVCS stores *all* info offline? (A log without diffs is useless for me.) Although I am used to having 100Mbps internet, I don't find SVN slow on my current connection (don't know the exact speed... it's cable, 2Mbps or something). But in any case, an offline history might be nice, I guess.
I really don't understand why you think that a DVCS would cause people to practice any worse VCS hygiene than now. DVCS gives you freedom to commit in smaller logical blocks. As a person who worked with branches in SVN extensively you should understand the utility of this. You likely committed much more frequently on your branch than you would have committed to trunk. The diffs were smaller and easier to read. The commit messages were more granular and more meaningful. And when working on a branch svn update is equivalent to merge. You didn't merge for every commit. You had the freedom to work out an idea while committing bite-sized pieces and dealing with the integration problems when you chose to. But you still dealt with them.
About my gsoc2007 svn experience/method: the frequent committing was mainly because of switching PCs, actually, and was possible because I was the only one committing to the branch. I think it is actually very unclear to commit in small chunks, especially for reviewing or trying to understand what was needed for certain functionality. Perhaps the diff comments were more meaningful, but that's because there is more diff-comment text, and because each comment is only meaningful in the context of all the other commits. Reverting one of those commits would be useless; you'd have to revert 10 successive ones. When people want more meaningful diff comments, I think we should write longer diff comments instead of doing more commits.
I am very afraid I will lose all the nice features that SVN/Tortoise gives me now, without gaining anything myself. When reading TortoiseBZR: "TODO: Enable add, merge, log, revert, etc context menu options." :'(
Making your own local "branch" offline is already very easy btw: just copy your inkscape checkout, including the already compiled files, for a speedy rebuild. It's what I sometimes do when working on a large thing for multiple days.
I think the grass is always much greener on the neighbour's lawn.
Regards, Johan

On 2008-March-17 , at 18:01 , J.B.C.Engelen@...1578... wrote:
From: Aaron Spike [mailto:aaron@...749...]
Sent: Monday, March 17, 2008 13:27
one of the things I noticed is that any use of the log functionality in TortoiseSVN was unbearably slow because it requires network access. A DVCS stores all this information locally, so the logs become much more useful and accessible. I think you'll appreciate this.
This does indeed sound nice. So, DVCS stores *all* info offline? (A log without diffs is useless for me.) Although I am used to having 100Mbps internet, I don't find SVN slow on my current connection (don't know the exact speed... it's cable, 2Mbps or something). But in any case, an offline history might be nice, I guess.
Yes, it does. Everything is available offline (except pushing to a remote repository, obviously). As for speed, I guess it is only possible to really feel the difference by trying it. Once you get used to an immediate status or log, it becomes a pain to wait even 2-4 s for it ;)
I really don't understand why you think that a DVCS would cause people to practice any worse VCS hygiene than now. DVCS gives you freedom to commit in smaller logical blocks. As a person who worked with branches in SVN extensively you should understand the utility of this. You likely committed much more frequently on your branch than you would have committed to trunk. The diffs were smaller and easier to read. The commit messages were more granular and more meaningful. And when working on a branch svn update is equivalent to merge. You didn't merge for every commit. You had the freedom to work out an idea while committing bite-sized pieces and dealing with the integration problems when you chose to. But you still dealt with them.
About my gsoc2007 svn experience/method: the frequent committing was mainly because of switching PCs, actually, and was possible because I was the only one committing to the branch. I think it is actually very unclear to commit in small chunks, especially for reviewing or trying to understand what was needed for certain functionality.
The equivalent diff in a DVCS would be the one you push to the central repository.
Perhaps the diff comments were more meaningful, but that's because there is more diff-comment text, and because each comment is only meaningful in the context of all the other commits. Reverting one of those commits would be useless; you'd have to revert 10 successive ones. When people want more meaningful diff comments, I think we should write longer diff comments instead of doing more commits.
I am very afraid I will lose all the nice features that SVN/Tortoise gives me now, without gaining anything myself. When reading TortoiseBZR: "TODO: Enable add, merge, log, revert, etc context menu options." :'(
This, I am afraid, is likely... But that's probably a chicken and egg problem: when enough people use those, the tools will probably improve.
Making your own local "branch" offline is already very easy btw: just copy your inkscape checkout, including the already compiled files, for a speedy rebuild. It's what I sometimes do when working on a large thing for multiple days.
But you cannot commit to it. This is what a DVCS would bring you.
I think the grass is always much greener on the neighbour's lawn.
I think that seeing many other major projects move towards these tools may be reassuring in this respect. If you have a little free time, reading about them can be insightful (the bazaar wiki is particularly rich):
http://bazaar-vcs.org/BzrWhy
Or there are always the google talk videos about git (watching a video is always nice, right? ;)):
http://video.google.com/videoplay?docid=-2199332044603874737
http://www.youtube.com/watch?v=8dhZ9BXQgc4
JiHO --- http://jo.irisson.free.fr/

On Mon, 17 Mar 2008 18:01:48 +0100 J.B.C.Engelen@...1578... wrote:
I think it is actually very unclear to commit in small chunks.
Only your local, "personal" commits are likely to be "small chunks"; others won't see them, and you don't have to make them unless you want to. Personally I think it's great to be able to regularly "Save Game" locally and then merge / push to the shared system when I've actually accomplished something.
The leo-editor project I hack on a bit has just moved from CVS to launchpad/bzr. It's a smaller project than inkscape, but generally the transition has been a good thing. There's the EOLs of course, but you can deal with those. And bzr can accommodate almost any workflow; it can behave like CVS if you want. This is good but makes it a little confusing at first, although I think that can be overcome by more experienced users sharing recipes - that's what happened on the leo-editor project.
I think the grass is always much greener on the neighbour's lawn.
:-) Good point, but I don't think there's much question that bzr/git/hg are more flexible, productive models than SVN - not that that helps if you want a friendly gui for bzr and there isn't one. I suspect that command line use is simpler in bzr than svn though.
Cheers -Terry

On Mon, 2008-03-17 at 07:26 -0500, Aaron Spike wrote:
One of the nice things about TortoiseSVN: when you do an SVN update, you can press the view log button, which then shows *only* the log of revisions since the last svn update. This way it is very easy to keep track of what changed recently. Since the log UI of TortoiseSVN is quite nice, you can see which files changed and click on them to see the diff. Makes for easy reviewing when one wants it (I often check files of which I know the code), but does not demand reviewing.
I think the workflow for a DVCS is a bit different. For example I keep one branch to track the state of the main repo and another branch for each task I'm currently working on (branches rather than separate checkouts). Now I can update the state of the tracking branch as often as I want (independently of my working branch, like you are used to in svn). And because the system knows which revisions are in which branches I can check the log at anytime and get just the new commits by comparing the commits on each branch. In git the command for this is called `git whatchanged`. Also one of the things I noticed is that any use of the log functionality in TortoiseSVN was unbearably slow because it requires network access. A DVCS stores all this information locally, so the logs become much more useful and accessible. I think you'll appreciate this.
I'd agree with this, but I'll also note that it makes you lazy when you're trying to work with SVN at the base. I think Bryce was very unhappy when I sent a large patch for 0.46: while it was broken into nice easy chunks on my branch, I had no way to communicate that easily for integration into SVN.
This is also the reason that I'd like to have hosting available. I'm sure I could set up my git hosting, but I'd prefer not to. Sometimes I am working on things that other people would be interested in, but it stays on my computer because it's simpler that way. Being able to push stuff out there before committing it to the main repository would help my workflow.
I'll also comment (in a way that will probably get lost in this thread) that we need to give some lead time before a switch. Right now my SVK branch has changes on it, and I don't think any solution we choose will import from SVK nicely :) So, I need to finish and merge them before we switch.
--Ted

On 2008-March-17 , at 18:47 , Ted Gould wrote:
I'll also comment (in a way that will probably get lost in this thread) that we need to give some lead time before a switch. Right now my SVK branch has changes on it, and I don't think any solution we choose will import from SVK nicely :) So, I need to finish and merge them before we switch.
Well, the git wiki page says: "Subversion and svk repositories can be used directly with git-svn"
JiHO --- http://jo.irisson.free.fr/

On Mon, 17 Mar 2008 11:05:23 +0100, J.B.C.Engelen@...1578... wrote:
I am very much against doing offline 'commits', just from what I see happening in practice. Although it is nice to be able to 'save' work when there is no internet connection available, it removes an *essential* part from the workflow: svn update, and checking whether everything still works, possibly checking source code changes. I'm not sure whether everybody does this:
0. svn update
1. code stuff
2. svn update (!!!)
3. make
4. does it work? did merging go well?
5. svn commit
2-5 happen in rapid succession and need internet. This at least ensures that we don't get missing-';' build breakage etc., but it also prevents more complex problems. I think any offline-commit VCS method removes 2 to 5.
This isn't a real issue; with a distributed SCM you still have to merge your local commits with upstream before the system will let you push them upstream.
-mental

On 2008-March-14 , at 13:04 , Aaron Spike wrote:
jiho wrote:
Since this supposes a lot of branching (for SoC, probably also for many large scale changes because the tree still needs to be usable during that time etc.) wouldn't it be a good time to change version control system to something that eases branching and merging? [...]
I think this would be a great time to encourage DVCS usage among the developers. But until a number of us have some experience from which to speak about the products, making a decision to use one would make me nervous.
From past discussion, I gathered that enough people had solid experience with one of those three systems. No one seemed to be an expert in all three, but there is still enough data to make an informed decision.
As you may know a few people are already using DVCS systems to do their work on inkscape. Ted uses SVK (you can tell by the number of times that his commit messages say "SVK screwed up the last commit" :-) ). Mental uses git-svn (he described his use of git-svn for jruby at http://moonbase.rydia.net/mental/blog/programming/using-git-svn-for-jruby) . I've heard that a similar workflow is possible with bzr (http://bazaar-vcs.org/BzrForeignBranches/Subversion ). I have not heard anything of hg but I would expect something similar. We can try these tools now, without switching over the entire repo.
Having said that perhaps there is some worth in switching now.
I know git-svn from using it with all my svn-based repos now. It works nicely indeed. However, I still think the best solution would be to switch completely:
- it would not cost much to each dev (for git at least, skimming over http://git.or.cz/course/svn.html takes 5 minutes, reading it thoroughly takes 30; not a big deal)
- it would avoid a two-step process for everyone (svn co first and then git-svn init if the person wants to do refactoring work on his/her side)
- the only situation I see where svn is superior is regarding the integration with online repository browsers and tracking systems such as Trac (there are plugins for git etc., but svn is the default and better supported). But Inkscape does not use such a thing anyway, and the online browser of SourceForge is crap enough to be worth replacing by git's web interface. NB: I never used the new interface in Launchpad. I don't know whether it would integrate with git.
OK, now I think others probably have more informed opinions. Please copy the point by point summary below and add to it.
-- GIT ----------------------------
Good
- very fast (but anything would be faster than svn anyway)
- allows stashing away local changes to fix a small issue; probably useful when, in the middle of a large code change, one notices a local bug which does not have anything to do with the change (not sure this is git-only)
- ability to follow chunks of code around, without relying on file names; probably useful from a refactoring point of view
- probably the fastest growing user base => many tools
- TextMate bundle
Bad

-- HG -----------------------------
Good
- tortoiseHG
- TextMate bundle
Bad

-- BZR ----------------------------
Good
- already integrated in launchpad
- supports renames (could be considered a Bad by git users)
- supports bundling changesets
- tortoiseBZR
Bad
JiHO --- http://jo.irisson.free.fr/

On Fri, Mar 14, 2008 at 02:07:29PM +0100, jiho wrote:
On 2008-March-14 , at 13:04 , Aaron Spike wrote:
As you may know a few people are already using DVCS systems to do their work on inkscape. Ted uses SVK (you can tell by the number of times that his commit messages say "SVK screwed up the last commit" :-) ). Mental uses git-svn (he described his use of git-svn for jruby at http://moonbase.rydia.net/mental/blog/programming/using-git-svn-for-jruby) . I've heard that a similar workflow is possible with bzr (http://bazaar-vcs.org/BzrForeignBranches/Subversion ). I have not heard anything of hg but I would expect something similar. We can try these tools now, without switching over the entire repo.
Having said that perhaps there is some worth in switching now.
I know git-svn from using it with all my svn-based repos now. It works nicely indeed. However, I still think the best solution would be to switch completely:
- it would not cost much to each dev (for git at least, skimming over http://git.or.cz/course/svn.html takes 5 minutes, reading it thoroughly takes 30; not a big deal)
- it would avoid a two-step process for everyone (svn co first and then git-svn init if the person wants to do refactoring work on his/her side)
- the only situation I see where svn is superior is regarding the integration with online repository browsers and tracking systems such as Trac (there are plugins for git etc., but svn is the default and better supported). But Inkscape does not use such a thing anyway, and the online browser of SourceForge is crap enough to be worth replacing by git's web interface. NB: I never used the new interface in Launchpad. I don't know whether it would integrate with git.
I agree this is the right time to talk about switching the repo. Not only to upgrade from SVN, but because SourceForge's SVN service is imperfect (this last month in the middle of our release they expired everyone's SVN passwords with no notice or reason given).
OK, now I think others probably have more informed opinions. Please copy the point by point summary below and add to it.
-- GIT ----------------------------
Good
- very fast (but anything would be faster than svn anyway)
- allows stashing away local changes to fix a small issue; probably useful when, in the middle of a large code change, one notices a local bug which does not have anything to do with the change (not sure this is git-only)
- ability to follow chunks of code around, without relying on file names; probably useful from a refactoring point of view
- probably the fastest growing user base => many tools
- TextMate bundle
(What is TextMate?)
Bad
- Git is designed with the Linux kernel's hierarchical workflow in mind, so may require alteration in how Inkscape's workflow works.
- We would need to set up and administer this ourselves
- Revisions are indicated with SHAs rather than numbers
- Commands dissimilar to what we're used to with svn (e.g., no 'git update')
-- HG -----------------------------
Good
- tortoiseHG
- TextMate bundle
Bad

-- BZR ----------------------------
Good
- already integrated in launchpad
  http://doc.bazaar-vcs.org/bzr.dev/en/tutorials/using_bazaar_with_launchpad.h...
  + Associating branches to bugs
  + Associating branches to blueprints
  + Changes state of bugs in LP when committing in Bazaar
  + Web interface for creating branches quickly and easily
- supports renames (could be considered a Bad by git users)
- supports bundling changesets
- tortoiseBZR
- Fewer commands and built-in help - makes it easier to learn
- Supports multiple workflows, including what we use for Inkscape (see http://bazaar-vcs.org/Workflows)
- Many plugins
- Easy administration. Mostly done through Launchpad, plus Bzr folks are available to help.
- Git uses SHAs for revisions (e.g. 824853772241acf64bc37ac8b85254194741ae13) where bzr uses numbers (e.g., 1, 2, 3.2, 63.3.5.9); bzr has UUIDs internally, but they're not needed in day-to-day use.
Bad
I happened to be working on two small new projects with other folks in recent months, one under git, the other under bzr, which for me highlighted the benefits of bzr.
For the bzr project, I essentially needed to branch an existing codebase and make some small modifications to it. I located the project on Launchpad, and on the 'code' tab clicked 'Register branch', and it gave me the command to upload my branched code. A few minutes later I was off and running. I hadn't studied any bzr commands, but just guessed based on my svn experience, and everything worked as I expected.
For the git project, the other contributor strongly wanted git, because it was his aim for the code to be adopted by debian. I went through the guidelines for setting up a git repo at debian (took a couple hours to get it all sorted out properly), and let the contributor know. He did not actually know git, so it took him some time to get up to speed, but before long he was sending me diffs to integrate. However, having to wait for me to commit his work created an annoying bottleneck. Perhaps someone with more git know-how would know how to enable multiple committers, but it was not obvious to us git neophytes. Finally, the other contributor switched us over to bzr, and there's been no trouble since.
In both cases, I could take advantage of the distributed form, doing commits for each change individually, and then running a '*** push' when ready to share publicly. There wasn't much performance difference with commits between the two; however, I did notice git to be faster on pushes, but bzr was definitely fast enough - no more than the time to take a sip of coffee; definitely faster than svn.
There were also a lot of other little differences that made bzr a bit more convenient. For instance, in git if you make changes to several files, you have to individually 'git add' each one in order for them to be included in the commit, or you can do 'git commit -a', which commits all changes. In either case it's different than you'd expect. In bzr, you just do 'bzr commit' and it commits all changes, just like svn.
I've also done merges in git and bzr. With git, the first time I did a merge it was a minor nightmare trying to figure out what options I needed to run. So I was apprehensive the first time I was asked to do a bzr merge. However, it was almost trivially easy.
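For reference, the basic bzr merge flow is just the following (branch path hypothetical):

    bzr merge ../their-branch                  # apply their changes; conflicts get flagged
    bzr commit -m "merge from their-branch"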
Bryce

On 2008-March-14 , at 21:05 , Bryce Harrington wrote:
On Fri, Mar 14, 2008 at 02:07:29PM +0100, jiho wrote:
On 2008-March-14 , at 13:04 , Aaron Spike wrote:
[..]. We can try these tools now, without switching over the entire repo. [..]
However, I still think the best solution would be to switch completely: [...]
I agree this is the right time to talk about switching the repo. Not only to upgrade from SVN, but because SourceForge's SVN service is imperfect (this last month in the middle of our release they expired everyone's SVN passwords with no notice or reason given).
- TextMate bundle
(What is TextMate?)
TextMate is a (very, very cool ;) ) editor for OS X: http://macromates.com/ . I mentioned it because, first, it is very popular and, second, since there is no real equivalent of TortoiseSVN on OS X, people usually rely on their editor/IDE to provide an interface to version control.
<digression> TextMate provides an interface for SVN, Git and Mercurial through collaboratively developed bundles, but nothing for Bazaar (this project is quite odd: the editor itself is shareware and definitely closed, but the development of functionality goes through language bundles which are community driven. In this respect it is very much like any average open-source project). However, I may be able to come up with something for Bazaar in June: the basic functionality is made available by TextMate's core. What's missing is a nice set of scripts to glue all that together, and this I know how to do, given a little time ;) For those wondering, there is a Windows equivalent of TM: E http://www.e-texteditor.com/ . Sadly, nothing on Linux, but "port to linux" is probably the #1 request on TextMate's ticket system. I must admit that this is also the #1 reason that made me buy another Mac and stick with OS X instead of switching to Ubuntu. It's that good. </digression>
OK, summary and comments:
-- GIT ----------------------------
+ maybe a bit faster at commits
+ allows stashing away local changes to fix a small issue; probably useful when, in the middle of a large code change, one notices a local bug which does not have anything to do with the change -> not sure this is git-only
+ ability to follow chunks of code around, without relying on file names. Probably useful from a refactoring point of view
+ probably the fastest growing user base => many tools
+ TextMate bundle
- git is designed with the Linux kernel's hierarchical workflow in mind (few 'push'ers to the central repos), so it may require altering how Inkscape's workflow works (any of the numerous contributors can commit) -> this workflow may be particularly suited to the refactoring part though; it would force code review
- we would need to set up and administer this ourselves -> anything moving us off SourceForge would be the same, wouldn't it? and moving off SourceForge seemed like a good thing (cf. you, above)
- revisions are indicated with SHAs rather than numbers
- commands dissimilar to what we're used to with svn (e.g., no 'git update')
- have to 'git add' all modified files before committing them (or resort to a non-specific 'commit -a')
-- HG -----------------------------
+ tortoiseHG
+ TextMate bundle
-- BZR ----------------------------
+ already integrated in launchpad: http://doc.bazaar-vcs.org/bzr.dev/en/tutorials/using_bazaar_with_launchpad.h...
  + Associating branches to bugs
  + Associating branches to blueprints
  + Changes state of bugs in LP when committing in Bazaar
  + Web interface for creating branches quickly and easily
+ supports renames (could be considered a Bad by git users)
+ supports bundling changesets
+ tortoiseBZR
+ Fewer commands - makes it easier to learn -> dropped "built-in help" since it is probably true for all of them (git's help and man pages are OK, and I can't imagine that Hg does not have help)
+ Supports multiple workflows, including what we use for Inkscape (see http://bazaar-vcs.org/Workflows)
+ Many plugins -> not sure what this means
+ Easy administration. Mostly done through Launchpad, plus the Bzr folks are available to help.
+ bzr uses numbers for revisions (e.g., 1, 2, 3.2, 63.3.5.9). bzr has UUIDs internally, but they're not needed in day-to-day use.
Bazaar looks very cool. But is there anything like stash in bzr? It is incredibly useful, and I suspect it will be even more so once people start looking at the code in more detail during refactoring.
JiHO --- http://jo.irisson.free.fr/

On 2008-March-14 , at 23:03 , Aaron Spike wrote:
jiho wrote:
Bazaar looks very cool. But is there anything like stash in bzr?
bzr shelve
OK, then stash is not a git advantage anymore:
-- GIT ----------------------------
+ maybe a bit faster at commits
+ ability to follow chunks of code around, without relying on file names. Probably useful from a refactoring point of view
+ probably the fastest growing user base => many tools
+ TextMate bundle
- git is designed with the Linux kernel's hierarchical workflow in mind (few 'push'ers to the central repos), so may require altering how Inkscape's workflow works (any of the numerous contributors can commit)
- we would need to set up and administer this ourselves -> moving off SourceForge seemed like a good thing anyway (cf. you, above)
- revisions are indicated with SHAs rather than numbers
- commands dissimilar to what we're used to with svn (e.g., no 'git update')
- have to 'git add' all modified files before committing them (or resort to a non-specific 'commit -a')
-- HG -----------------------------
+ tortoiseHG
+ TextMate bundle
-- BZR ----------------------------
+ already integrated in launchpad: http://doc.bazaar-vcs.org/bzr.dev/en/tutorials/using_bazaar_with_launchpad.h...
  + Associating branches to bugs
  + Associating branches to blueprints
  + Changes state of bugs in LP when committing in Bazaar
  + Web interface for creating branches quickly and easily
+ supports renames (could be considered a Bad by git users)
+ supports bundling changesets
+ tortoiseBZR
+ Fewer commands - makes it easier to learn
+ Supports multiple workflows, including what we use for Inkscape (See http://bazaar-vcs.org/Workflows)
+ Easy administration. Mostly done through Launchpad, plus the Bzr folks are available to help.
+ bzr uses numbers for revisions (e.g., 1, 2, 3.2, 63.3.5.9). bzr has UUIDs internally, but they're not needed in day-to-day use.
JiHO --- http://jo.irisson.free.fr/

On Fri, 2008-03-14 at 23:20 +0100, jiho wrote:
-- BZR ----------------------------
- already integrated in launchpad: http://doc.bazaar-vcs.org/bzr.dev/en/tutorials/using_bazaar_with_launchpad.h...
- Associating branches to bugs
- Associating branches to blueprints
- Changes state of bugs in LP when committing in Bazaar
- Web interface for creating branches quickly and easily
I don't know if this is overlap, but I think a strong advantage of BZR is that there is publicly available hosting. So if a new contributor wants to provide a change they either have to make a diff, or find some way to host their branch. The latter is taken care of already for us in the BZR case. So, I guess the bullet point is:
+ publicly available hosting
The other BZR plus for me is the PQM solution, where one can set up a situation where something like "make check" is run on every commit (see Jon's earlier e-mail). I realize this could be built with other systems, but again it is already built and maintained for us (roughly the gatekeeping loop sketched below the bullet). Bullet:
+ commit-based build-checking tool available
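A hand sketch of the idea only, not PQM's actual implementation; CANDIDATE is a hypothetical branch location submitted for merging:

    if bzr merge "$CANDIDATE" && make check; then
        bzr commit -m "merge approved branch"
    else
        bzr revert                       # tests failed: reject the merge
    fi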
A minor plus:
+ Bazaar is a part of the GNU project
And lastly:
+ Ted's used it before and hasn't been offended ;)
--Ted

On Fri, 14 Mar 2008 15:55:27 -0700, Ted Gould <ted@...11...> wrote:
I don't know if this is overlap, but I think a strong advantage of BZR is that there is publicly available hosting. So if a new contributor wants to provide a change they either have to make a diff, or find some way to host their branch. The latter is taken care of already for us in the BZR case. So, I guess the bullet point is:
- publicly available hosting
Well, not only that, but the bzr accounts would be integrated with the accounts for our bug tracker too.
-mental

On 2008-March-15 , at 00:00 , MenTaLguY wrote:
On Fri, 14 Mar 2008 15:55:27 -0700, Ted Gould <ted@...11...> wrote:
I don't know if this is overlap, but I think a strong advantage of BZR is that there is publicly available hosting. So if a new contributor wants to provide a change they either have to make a diff, or find some way to host their branch. The latter is taken care of already for us in the BZR case. So, I guess the bullet point is:
- publicly available hosting
Well, not only that, but the bzr accounts would be integrated with the accounts for our bug tracker too.
I'm not sure there's really a need to continue counting points right now. It seems bazaar is the clear winner. Is there someone with a strong opinion about that? Who's organising the switch? ;)
PS:
And lastly:
- Ted's used it before and hasn't been offended ;)
That was the selling point for me :P
JiHO --- http://jo.irisson.free.fr/

On Sat, 15 Mar 2008 00:21:47 +0100, jiho <jo.irisson@...400...> wrote:
I'm not sure there's really a need to continue counting points right now. It seems bazaar is the clear winner. Is there someone with a strong opinion about that? Who's organising the switch? ;)
Well, first I think we need to do a trial import into bzr to make sure that performance is acceptable?
-mental

On Fri, 2008-03-14 at 17:05 -0700, MenTaLguY wrote:
On Sat, 15 Mar 2008 00:21:47 +0100, jiho <jo.irisson@...400...> wrote:
I'm not sure there's really a need to continue counting points right now. It seems bazaar is the clear winner. Is there someone with a strong opinion about that? Who's organising the switch? ;)
Well, first I think we need to do a trial import into bzr to make sure that performance is acceptable?
https://code.edge.launchpad.net/~vcs-imports/inkscape/main
Unfortunately the vcs-imports are only available via http, which is slower than the bzr+ssh protocol. But if you build a branch in your account you can pull that in, and then push and do whatever in your branch with bzr+ssh.
--Ted

On Fri, 2008-03-14 at 17:05 -0700, MenTaLguY wrote:
On Sat, 15 Mar 2008 00:21:47 +0100, jiho <jo.irisson@...400...> wrote:
I'm not sure there's really a need to continue counting points right now. It seems bazaar is the clear winner. Is there someone with a strong opinion about that? Who's organising the switch? ;)
Well, first I think we need to do a trial import into bzr to make sure that performance is acceptable?
I don't know what you'd define there, but I was curious. So I branched the import (10 minutes), changed the changelog, and pushed it into a private branch (35 minutes). Made another change to the changelog and pushed it (15 seconds). The whole repository is about 80 MB and my DSL is 768/128.
Any other metrics?
--Ted

On Fri, Mar 14, 2008 at 09:35:55PM -0700, Ted Gould wrote:
On Fri, 2008-03-14 at 17:05 -0700, MenTaLguY wrote:
On Sat, 15 Mar 2008 00:21:47 +0100, jiho <jo.irisson@...400...> wrote:
I'm not sure there's really a need to continue counting points right now. It seems bazaar is the clear winner. Is there someone with a strong opinion about that? Who's organising the switch? ;)
Well, first I think we need to do a trial import into bzr to make sure that performance is acceptable?
I don't know what you'd define there, but I was curious. So I branched the import (10 minutes), changed the changelog, and pushed it into a private branch (35 minutes). Made another change to the changelog and pushed it (15 seconds). The whole repository is about 80 MB and my DSL is 768/128.
Any other metrics?
To give another data point, I branched ted's branch (6m15.4s), modified the Changelog, committed locally (13.1s), registered a new branch that anyone on the Inkscape-Admin team can commit to, and then pushed up the new branch (39m19.8s). A local commit to Changelog took 2.8s, and push took 15.4 sec (these are more representative of day-to-day work). I then proposed to merge into Ted's branch. This is with Comcast cablemodem with similar up/down rates as ted.
I also gave it a try on an account on one of the canonical.com servers. branch: 46s, commit: 4.2s. So network performance seems to be a large driver.
Anyone got comparable numbers for git?
Bryce

On Fri, 2008-03-14 at 23:49 -0700, Bryce Harrington wrote:
On Fri, Mar 14, 2008 at 09:35:55PM -0700, Ted Gould wrote:
On Fri, 2008-03-14 at 17:05 -0700, MenTaLguY wrote:
On Sat, 15 Mar 2008 00:21:47 +0100, jiho <jo.irisson@...400...> wrote:
I'm not sure there's really a need to continue counting points right now. It seems bazaar is the clear winner. Is there someone with a strong opinion about that? Who's organising the switch? ;)
Well, first I think we need to do a trial import into bzr to make sure that performance is acceptable?
I don't know what you'd define there, but I was curious. So I branched the import (10 minutes), changed the changelog, and pushed it into a private branch (35 minutes). Made another change to the changelog and pushed it (15 seconds). The whole repository is about 80 MB and my DSL is 768/128.
Any other metrics?
To give another data point, I branched ted's branch (6m15.4s), modified the Changelog, committed locally (13.1s), registered a new branch that anyone on the Inkscape-Admin team can commit to, and then pushed up the new branch (39m19.8s). A local commit to Changelog took 2.8s, and push took 15.4 sec (these are more representative of day-to-day work). I then proposed to merge into Ted's branch. This is with Comcast cablemodem with similar up/down rates as ted.
I also gave it a try on an account on one of the canonical.com servers. branch: 46s, commit: 4.2s. So network performance seems to be a large driver.
For me, with git, a remote clone took about 5 minutes, pushing to a new remote branch in the same remote repository took 4 seconds, pushing to a new remote branch in a fresh remote repository took about 6 minutes, the local commit took less than a second, and the subsequent push to an established remote branch took just over a second.
Note that this was with the partial history since the switch to subversion (since that's what I have easily available in git right now); with a full history I would expect the times for the initial remote clone or push to be at least a little longer (though not linearly so, since git sends deltas). Anyway, it sounds like git and bzr are at least within the same order of magnitude for cloning, so that's cool.
I'm not sure what's up with pushing, though. I find the idea of waiting more than half an hour for a full push really appalling; I'm not keen on waiting 15 seconds for a trivial push to finish either; and I'm kinda used to local commits being more or less instantaneous (what is there to do that takes four seconds?).
I don't see that things are so bad for users accustomed to SVN, though. The one issue I do see being a concern for SVN users is the initial remote push time. With SVN, creating a new remote branch is effectively instantaneous (as it is with git when you're branching within the same repository). HALF AN HOUR is nuts!
Is there a way to have multiple branches in the same repository with bzr, or faster ways to do it when both branches are hosted on the same server?
-mental

On Sat, Mar 15, 2008 at 04:46:47PM -0400, MenTaLguY wrote:
On Fri, 2008-03-14 at 23:49 -0700, Bryce Harrington wrote:
On Fri, Mar 14, 2008 at 09:35:55PM -0700, Ted Gould wrote:
On Fri, 2008-03-14 at 17:05 -0700, MenTaLguY wrote:
On Sat, 15 Mar 2008 00:21:47 +0100, jiho <jo.irisson@...400...> wrote:
I'm not sure there's really a need to continue counting points right now. It seems bazaar is the clear winner. Is there someone with a strong opinion about that? Who's organising the switch? ;)
Well, first I think we need to do a trial import into bzr to make sure that performance is acceptable?
I don't know what you'd define there, but I was curious. So I branched the import (10 minutes), changed the changelog, and pushed it into a private branch (35 minutes). Made another change to the changelog and pushed it (15 seconds). The whole repository is about 80 MB and my DSL is 768/128.
Any other metrics?
To give another data point, I branched ted's branch (6m15.4s), modified the Changelog, committed locally (13.1s), registered a new branch that anyone on the Inkscape-Admin team can commit to, and then pushed up the new branch (39m19.8s). A local commit to Changelog took 2.8s, and push took 15.4 sec (these are more representative of day-to-day work). I then proposed to merge into Ted's branch. This is with Comcast cablemodem with similar up/down rates as ted.
I also gave it a try on an account on one of the canonical.com servers. branch: 46s, commit: 4.2s. So network performance seems to be a large driver.
For me, with git, a remote clone took about 5 minutes, pushing to a new remote branch in the same remote repository took 4 seconds, pushing to a new remote branch in a fresh remote repository took about 6 minutes, the local commit took less than a second, and the subsequent push to an established remote branch took just over a second.
Just to make sure we're comparing apples to apples, would you mind also repeating the procedure with bzr, so we can rule out differences in network performance?
Bryce

Do we have a reference point for SVN on any of these numbers?
On Sun, Mar 16, 2008 at 3:56 PM, Bryce Harrington <bryce@...1798...> wrote:
[...]
Just to make sure we're comparing apples to apples, would you mind also repeating the procedure with bzr, so we can rule out differences in network performance?
Bryce

On Sun, 2008-03-16 at 12:56 -0700, Bryce Harrington wrote:
For me, with git, a remote clone took about 5 minutes, pushing to a new remote branch in the same remote repository took 4 seconds, pushing to a new remote branch in a fresh remote repository took about 6 minutes, the local commit took less than a second, and the subsequent push to an established remote branch took just over a second.
Just to make sure we're comparing apples to apples, would you mind also repeating the procedure with bzr, so we can rule out differences in network performance?
I get 5 minutes for branch, 17 minutes for remote push to a new branch, 4 seconds for commit, and 10 seconds for a push to an existing remote branch.
The 17 minutes is a bit more livable, but I do wonder -- shouldn't there be some way to have bzr branches hosted in the same location share revision data, so you don't need to copy so much around?
-mental

On Sun, Mar 16, 2008 at 08:56:59PM -0400, MenTaLguY wrote:
On Sun, 2008-03-16 at 12:56 -0700, Bryce Harrington wrote:
For me, with git, a remote clone took about 5 minutes, pushing to a new remote branch in the same remote repository took 4 seconds, pushing to a new remote branch in a fresh remote repository took about 6 minutes, the local commit took less than a second, and the subsequent push to an established remote branch took just over a second.
Just to make sure we're comparing apples to apples, would you mind also repeating the procedure with bzr, so we can rule out differences in network performance?
I get 5 minutes for branch, 17 minutes for remote push to a new branch, 4 seconds for commit, and 10 seconds for a push to an existing remote branch.
So it sounds like except for the remote push, the numbers you're seeing for bzr are roughly within range of what you're seeing with git?
The 17 minutes is a bit more livable, but I do wonder -- shouldn't there be some way to have bzr branches hosted in the same location share revision data, so you don't need to copy so much around?
Yeah, see Martin Pool's response to my enquiry about this. It seems that while launchpad makes it very simple & convenient to make branches, the implementation currently deployed is not set up for shared storage. Sounds like if cloning performance is a killer issue for us we could self-host (in which case we lose the launchpad integration benefits), until launchpad gains this ability.
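For the self-hosting case, bzr's shared repositories cover this; a minimal sketch, with hypothetical server paths:

    # branches pushed under a shared repository store their revisions once,
    # so creating further branches on the same server is cheap
    bzr init-repo --no-trees /srv/bzr/inkscape
    bzr push bzr+ssh://example.com/srv/bzr/inkscape/trunk      # full upload once
    bzr push bzr+ssh://example.com/srv/bzr/inkscape/refactor   # reuses shared revisions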
Bryce

On Mar 14, 2008, at 4:21 PM, jiho wrote:
I'm not sure there's really a need to continue counting points right now. It seems bazaar is the clear winner. Is there someone with a strong opinion about that? Who's organising the switch? ;)
Just on my part/preference...
Git lacks a GUI, bzr is a bit behind on GUIs but catching up, and SVN's are decent. Not great, but decent.
Git does the job it's designed for very well, but for many people it's not the best approach. I jump between GUI clients and command-line. On Windows I'd use TkCVS, TortoiseCVS, and command-line CVS together, for different aspects. With SVN I use command-line and TkSVN.
Lack of Bzr integration with Eclipse was the last main negative I saw on it. SVN integration was OK, but slow.
I've also done a survey of systems recently for other reasons, and Bzr seems to be closest to getting better than SVN.
One main drawback I find with SVN is its lack of tagging. The switch to SVN, in fact, caught me just when I was trying to add tagging to branch merges to get nice history, but SVN can't handle that. :-(
http://www.twobarleycorns.net/tkcvs/screen-branch.html
I'd definitely want to be sure we'd not hit any other hidden limitations like SVN's lack of tagging.
Now if only Git or bzr had a GUI like Perforce's... :-)

On Sat, Mar 15, 2008 at 12:21:47AM +0100, jiho wrote:
I'm not sure there's really a need to continue counting points right now. It seems bazaar is the clear winner.
I've been composing the below before the above was posted.
Looking at http://en.wikipedia.org/wiki/Comparison_of_revision_control_software, there is one startling difference among git, hg and bzr: it is stated that bzr cannot do end-of-line conversions, i.e. cannot adapt the end-of-line characters of text files to match the end-of-line style of the operating system under which it is used. (Status: http://bazaar-vcs.org/LineEndings)
I wonder how much of an issue this is; could it actually be fatal for what is otherwise a very promising alternative? Let's look at how we would address this issue if we were to choose bzr.
If a commit includes a change that adds or removes CR to/from the end of each line, then it tends to make it awkward to merge changes for other people (at least in most systems, I suppose in bzr too). Thus, it would be nice if we could ensure that the "right" line endings are in use at commit time (converting or refusing to commit rather than committing the wrong thing) so that we never have commits that change the line endings in the repository version of the file. The checkeol plugin (http://bazaar.launchpad.net/~bialix/+junk/checkeol) promises to do this. How does one set it up -- can we make it enforce our chosen line ending convention for new files, or will it only check that existing files keep their existing line ending convention? If checkeol works well then we just need to decide what the "right" line ending is for each file type, and hope we never need to change our mind.
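Even without the plugin, the check itself is easy to approximate in shell (a sketch in bash syntax, not the checkeol plugin itself):

    # list tracked text files that contain CR characters (-I skips binaries);
    # run before committing to catch line-ending regressions
    find . -name .bzr -prune -o -type f -print0 | xargs -0 grep -lI $'\r'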
One issue is that, for traditional Unix tools, backslash-CR-LF means something quite different from backslash-LF. I've just tested: DOS line endings work fine for g++ and GNU make 3.81 (though I seem to recall that earlier versions didn't accept DOS line endings for backslash purposes), but not for shell scripts (either dash or bash).
It's been a while since I've used cygwin, but I believe that cygwin slightly prefers Unix line endings: that Unix line endings always work, while cygwin can be configured to allow DOS line endings.
So at least for shell scripts, we want Unix line endings.
I believe that all popular editors on Un*x, MacOS and Windows will happily edit files regardless of their line-ending convention, preserving that convention for added lines; except that the ever-present Windows Notepad requires DOS line endings. (Notepad isn't a programming editor, it's just the default text file editor, and I've come across people editing source files with it.) My information is rather dated, though. (Does the default text editor in Windows still not play well with Unix line endings?) Can anyone name other popular editors that don't work well with non-native line endings? Versions of vi other than vim don't handle non-Unix line endings, though I don't know if any Inkscape developers use vi other than vim; Solaris users, perhaps?
The main annoyance regardless of platform (depending on whether checkeol converts or merely warns) is when adding files, particularly when adding a large number of files. Tools like fromdos/todos will help, once installed. (Extra barrier for Inkscape development participation.)
Does anyone feel like advancing the state of this feature in bzr?
Otherwise, what convention would we choose for text files other than shell scripts? For READMEs I really don't care one way or the other. For C++ source code I hope we go with Unix [no doubt influenced by my being a Un*x user], but I can see arguments for choosing DOS line endings.
pjrm.

On Sat, 2008-03-15 at 18:09 +1100, Peter Moulder wrote:
On Sat, Mar 15, 2008 at 12:21:47AM +0100, jiho wrote:
I'm not sure there's really a need to continue counting points right now. It seems bazaar is the clear winner.
I've been composing the below before the above was posted.
Looking at http://en.wikipedia.org/wiki/Comparison_of_revision_control_software, there is one startling difference among git,hg,bzr: it is stated that bzr cannot do end-of-line conversions, i.e. cannot adapt the end of line characters for text files such that they match the end of line style for the operating system under which it is used. (Status: http://bazaar-vcs.org/LineEndings)
I've never seen this sort of feature work out well in practice. Invariably, binary files get marked as text or vice-versa. I've seen way too many problems over the years with files getting erroneously marked as "binary" or "text" when they weren't.
As far as I know, it's also not a feature we've been using in Subversion.
-mental

Quoting MenTaLguY <mental@...3...>:
On Sat, 2008-03-15 at 18:09 +1100, Peter Moulder wrote:
On Sat, Mar 15, 2008 at 12:21:47AM +0100, jiho wrote:
I'm not sure there's really a need to continue counting points right now. It seems bazaar is the clear winner.
I've been composing the below before the above was posted.
Looking at http://en.wikipedia.org/wiki/Comparison_of_revision_control_software, there is one startling difference among git,hg,bzr: it is stated that bzr cannot do end-of-line conversions, i.e. cannot adapt the end of line characters for text files such that they match the end of line style for the operating system under which it is used. (Status: http://bazaar-vcs.org/LineEndings)
I've never seen this sort of feature work out well in practice. Invariably, binary files get marked as text or vice-versa. I've seen way too many problems over the years with files getting erroneously marked as "binary" or "text" when they weren't.
As far as I know, it's also not a feature we've been using in Subversion.
Eeeeek!!!!
As far as I know, it's a feature we've used in *all* source control software. Even when I was using RCS only it would do the proper thing. If I check out onto a linux box, I get LF. If I check out on a Windows box I get CR-LF.
Does Bazaar really not do this?!?!?!?!

On Sat, 15 Mar 2008 18:56:38 -0300 jon@...18... wrote:
As far as I know, it's a feature we've used in *all* source control software. Even when I was using RCS only it would do the proper thing. If I check out onto a linux box, I get LF. If I check out on a Windows box I get CR-LF.
Does Bazaar really not do this?!?!?!?!
It really does not - I do some stuff on another project which just switched, and ran into this. I think (not sure) that pre-/post-checkin/checkout filters in bzr aren't mature/implemented yet, but could in theory be used.
However, the bzr belief, which I think probably holds true most of the time, is that most good editors will handle the various conventions transparently, so it's not an issue.
Of course the project I referred to above, the leo python editor / hierarchical data manager, doesn't transparently handle the eols, for fairly good reasons (it's not really a text editor).
I wrote a small python script that compares the working copy text file eols with the last committed version and offers to adjust the working copy versions to match; running this before diffing or committing works well for me.
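A rough shell equivalent of that idea (bash; assumes the tofrodos tools, and auto-adjusts rather than prompting):

    # make each modified file's line endings match its last committed version
    bzr status --short | awk '$1 == "M" {print $2}' | while read f; do
        if bzr cat "$f" | grep -q $'\r'; then
            todos "$f"                   # committed version is CR-LF
        else
            fromdos "$f"                 # committed version is LF
        fi
    done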
Cheers -Terry

Quoting Terry Brown <terry_n_brown@...36...>:
However, the bzr belief, which I think probably holds true most of the time, is that most good editors will handle the various conventions transparently, so it's not an issue.
EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEK!!!!!
My ...mumble...mumble... years of experience in the software field has shown the exact opposite.
While some editors deal with confused users decently, most do not - especially since it's hard to know in which way users are getting things wrong. MS's DevStudio is one of the worst for mixing line-ends, but others do it too.
Usually this will show up when someone spends lots of time tracking down weird bugs only to find out a line-end change messed up a macro with a continuation char at the end... things like that.
Even with SVN and such I've had to clean things up in Inkscape code now and then. Someone will check out and dev on one box, then check in from another. Often with Samba mounts or shared USB drives involved. Wreaks havoc with diffs and file history among other problems.
(then again if one never looks at file history and diffs, then one might never see a 'problem')

On Sat, Mar 15, 2008 at 05:15:33PM -0400, MenTaLguY wrote:
As far as I know, it's [...] not a feature we've been using in Subversion.
We have indeed been using it in subversion. Though we haven't had it set up to be added to new files automatically, so for example I see that the newly-added src/bind subtree doesn't use the feature. In the past, I've occasionally manually enabled the feature on new files (r1623{5,6,7}, r16085, r16043, r11156, r11011 according to svn log).
I've never seen this sort of feature work out well in practice. Invariably, binary files get marked as text or vice-versa. I've seen way too many problems over the years with files getting erroneously marked as "binary" or "text" when they weren't.
I have a feeling that svn handles it better than CVS, though I don't recall what the difference is. Maybe svn is better at guessing which files are binary. The absence of default keyword substitution can't hurt.
In most cases, problems can be fixed after the fact just by changing how the file is marked.
I agree that differing line-ending conventions are a source of pain, but I'm not sure that withholding conversion is the best way of avoiding pain: that option tends to leave Inkscape with some files using CR-LF and some files using LF, and revisions that include changing the line-ending format of existing lines (which makes merging a pain).
Thus, if we are to choose bzr, then I believe we want to decide what line-ending convention to use for each type of text file, and use an automated commit-time check to enforce that convention.
Or we work on bzr to support an equivalent of svn's svn:eol-style property. Or we choose a DVCS other than bzr.
pjrm.

Hi!
Yes, and I have noticed that there is a certain "stickiness," in that if a file is initially added with dos line endings, SVN (at least mine) tends to keep it that way.
Some of the files in question are the svg/smil/views/css/stylesheets "semi-official" Java interfaces classes from w3c.org's site. I've noticed that at least the svg set of original files had dos line endings, so I've run dos2unix on the entire directory.
Maybe people would like to give this a try, as an experiment:
find . -type f -print -exec dos2unix --safe {} \;
Tho, if svn considers -every- line different, it could cause transmission of the entire file.
bob
Peter Moulder wrote:
We have indeed been using it in subversion. Though we haven't had it set up to be added to new files automatically, so for example I see that the newly-added src/bind subtree doesn't use the feature. In the past, I've occasionally manually enabled the feature on new files (r1623{5,6,7}, r16085, r16043, r11156, r11011 according to svn log).

Bryce Harrington wrote:
What will the codebase cleanup work entail? The work will range from straightforward "grunt" work like making some simple code changes to all files in the codebase, to meatier work like abstracting widely used code into a more concise and powerful algorithm, to advanced work such as extracting distinct code into independent code packages.
I briefly mentioned this tool on chat, but I figured I'd throw it on the list for everyone to take a look at (and I'll add it to the wiki later):
http://pmd.sourceforge.net/cpd.html
I've run the current Inkscape codebase through the Copy/Paste Detector and found plenty of duplicated code that, at the very least, can be refactored to look nice and reduce the SLOC count. I've already done a little of this in pen-context.cpp, and plan on tackling some of the parts of the livarot code and some of the dialog code that look ripe for easy refactoring.
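For anyone wanting to reproduce this, the CPD command line documented for PMD 4.x is roughly the following (jar path hypothetical):

    java -cp pmd-4.1.jar net.sourceforge.pmd.cpd.CPD \
        --minimum-tokens 100 --language cpp --files src > duplicates.txt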
John

On Fri, Mar 14, 2008 at 08:41:28AM -0400, John Bintz wrote:
Bryce Harrington wrote:
What will the codebase cleanup work entail? The work will range from straightforward "grunt" work like making some simple code changes to all files in the codebase, to meatier work like abstracting widely used code into a more concise and powerful algorithm, to advanced work such as extracting distinct code into independent code packages.
I briefly mentioned this tool on chat, but I figured I'd throw it on the list for everyone to take a look at (and I'll add it to the wiki later):
Looks like this is in the roadmap. I think it's a good idea.
Note that we may also have "duplicate" code in terms of functionality, which is not cut-and-paste but rather a reimplementation of the same essential functionality (cxxtest and utest being a case in point). I'm not sure that CPD would detect such cases, so while it may catch a lot of the more egregious duplication, we shouldn't assume it will solve the issue 100%.
Bryce

On Fri, Mar 14, 2008 at 9:41 AM, John Bintz <jcoswell@...1414...> wrote:
I've run the current Inkscape codebase through the Copy/Paste Detector and found plenty of duplicated code that, at the very least, can be refactored to look nice and reduce the SLOC count. I've already done a little of this in pen-context.cpp, and plan on tackling some of the parts of the livarot code and some of the dialog code that look ripe for easy refactoring.
I don't think it's worth it to spend resources on livarot - eventually it's going to be replaced by cairo and 2geom.

Bryce Harrington wrote:
What will the codebase cleanup work entail? The work will range from straightforward "grunt" work like making some simple code changes to all files in the codebase, to meatier work like abstracting widely used code into a more concise and powerful algorithm, to advanced work such as extracting distinct code into independent code packages.
To boil this down into five high level objectives:
- Complete some of the big architectural refactoring efforts
- Reduce source code line count
- Break useful code out into stand-alone libraries
- Increase code stylistic consistency
- Make the codebase more convenient to code in
Does this also mean keeping multithreading in mind? https://bugs.launchpad.net/inkscape/+bug/200415
Multithreading and even distributed computing seem to be the future, something Inkscape will face sooner or later. Since cleanup will be done now it seems natural to incorporate it into the plan (sooner).
If it enters the plan I may be able to bribe a guy who's experienced in that field to help :)

On Fri, Mar 14, 2008 at 08:40:38PM +0100, Mihaela wrote:
Bryce Harrington wrote:
What will the codebase cleanup work entail? The work will range from straightforward "grunt" work like making some simple code changes to all files in the codebase, to meatier work like abstracting widely used code into a more concise and powerful algorithm, to advanced work such as extracting distinct code into independent code packages.
To boil this down into five high level objectives:
- Complete some of the big architectural refactoring efforts
- Reduce source code line count
- Break useful code out into stand-alone libraries
- Increase code stylistic consistency
- Make the codebase more convenient to code in
Does this also mean keeping multithreading in mind? https://bugs.launchpad.net/inkscape/+bug/200415
Multithreading and even distributed computing seem to be the future, something Inkscape will face sooner or later. Since cleanup will be done now it seems natural to incorporate it into the plan (sooner).
Good point; yes this has come up in the past and iirc the conclusion was always that some significant refactoring would be needed to bring the code closer to being able to run in a threaded fashion. I don't recall specifics, but if we could take steps to close that gap, it could make it more feasible to achieve. Mental - any thoughts here?
If it enters the plan I may be able to bribe a guy who's experienced in that field to help :)
Sure, even if just to provide some review and advice, it'd be appreciated.
Bryce

On Fri, 2008-03-14 at 22:13 -0700, Bryce Harrington wrote:
Multithreading and even distributed computing seem to be the future, something Inkscape will face sooner or later. Since cleanup will be done now it seems natural to incorporate it into the plan (sooner).
Good point; yes this has come up in the past and iirc the conclusion was always that some significant refactoring would be needed to bring the code closer to being able to run in a threaded fashion. I don't recall specifics, but if we could take steps to close that gap, it could make it more feasible to achieve. Mental - any thoughts here?
The main thing is eliminating tight coupling between subsystems, after which point we can make individual subsystems threaded/threadsafe.
There are some secondary concerns with libraries as well; for example libgc does not work well with threads that were not created via its thread wrappers. The upcoming version of libgc (7.2?) will have some additional thread registration functions which can be used to address that issue, but it's not released yet and even once it is it'll be a while before it hits distributions.
-mental

Quoting MenTaLguY <mental@...3...>:
There are some secondary concerns with libraries as well; for example libgc does not work well with threads that were not created via its thread wrappers. The upcoming version of libgc (7.2?) will have some additional thread registration functions which can be used to address that issue, but it's not released yet and even once it is it'll be a while before it hits distributions.
Yes, libraries might give us some unexpected problems. Just in adding the newer tablet support I happened to run across all sorts of things that are in the gtk or gdk headers that appear to be disabled for thread-safe builds.
(I'm hoping that is just legacy defines... but it may be a lurking problem)

On Fri, 2008-03-14 at 20:40 +0100, Mihaela wrote:
Multithreading and even distributed computing seem to be the future, something Inkscape will face sooner or later. Since cleanup will be done now it seems natural to incorporate it into the plan (sooner).
It's just not feasible right now. We need to clean up the design and partition things a lot better, so we could make individual portions of the codebase threadsafe. Right now, everything touches everything else, and nothing is threadsafe.
-mental

MenTaLguY wrote:
On Fri, 2008-03-14 at 20:40 +0100, Mihaela wrote:
Multithreading and even distributed computing seem to be the future, something Inkscape will face sooner or later. Since cleanup will be done now it seems natural to incorporate it into the plan (sooner).
It's just not feasible right now. We need to clean up the design and partition things a lot better, so we could make individual portions of the codebase threadsafe. Right now, everything touches everything else, and nothing is threadsafe. The main thing is eliminating tight coupling between subsystems, after which point we can make individual subsystems threaded/threadsafe.
-mental
Yes, separating parts of code into pieces that are as independent as possible is the ground work for multithreading. Isn't that what 0.47 cleanup will (try to) encompass?
There's a thing about multiprocessing to keep in mind here: you don't need to have your entire codebase cleaned up and threadsafe. I think it's fair to say no software ever is; you always have parts that can be split to work on more than one core, and parts that can't (the latter can even stay tangled spaghetti code).
It would be worthwhile to analyze the code and see what parts are appropriate for multithreading. Maybe you would only find 2 or 3 such places, but the speed increase on multicore machines could still be very significant.
The multithreading expert I mentioned isn't familiar with the Inkscape code and would need your input when it came to a specific case, but he thinks that heavy-load code segments are potentially the best candidates for multicore implementation:
1. places with large loops
2. heavy maths formulas (matrix operations included)
3. graphics rendering (depending on its multithreading features)
Some of it will be a complex job, but it's inevitable - it must be done, and it definitely can be done, with a little bit of lateral thinking and a little bit of effort. As I mentioned before, there are many levels of parallelization; it all depends on your features and code design.
In the new version of gcc, OpenMP will be implemented (I think it's in an experimental phase now). With it, parallelization will be much easier, so it would also be very good to implement OpenMP in Inkscape.
If you could find those heavy-load segments and apply OpenMP to them, that would be HUGE progress.
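To illustrate, a minimal sketch of an OpenMP-parallelized pixel loop (not actual Inkscape code; the buffer layout and gamma operation are hypothetical), built with g++ -fopenmp:

    #include <cmath>
    #include <vector>

    // Apply a per-pixel gamma curve. Each iteration touches only pixels[i],
    // so the loop is trivially data-parallel and OpenMP can split it across
    // the available cores.
    void apply_gamma(std::vector<float> &pixels, float gamma)
    {
        const int n = static_cast<int>(pixels.size());
    #pragma omp parallel for
        for (int i = 0; i < n; ++i) {
            pixels[i] = std::pow(pixels[i], gamma);
        }
    }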
The articles on the Intel site are good; you can check them out if you like: http://softwarecommunity.intel.com/isn/home/MultiCore.aspx
jon@...18... wrote:
Yes, libraries might give us some unexpected problems. Just in adding the newer tablet support I happened to run across all sorts of things that are in the gtk or gdk headers that appear to be disabled for thread-safe builds.
(I'm hoping that is just legacy defines... but it may be a lurking problem)
Yes, this problem needs to be addressed, but it's not a show-stopper; one of the methods you can apply here is to lock the non-thread-safe library so that only one thread can access it at a time.
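A minimal sketch of that locking method, with a hypothetical call standing in for the non-thread-safe library:

    #include <pthread.h>

    static pthread_mutex_t lib_lock = PTHREAD_MUTEX_INITIALIZER;

    extern void unsafe_library_call(int arg);  // hypothetical; not thread-safe

    // All callers go through this wrapper, so at most one thread is ever
    // inside the library at a time.
    void safe_library_call(int arg)
    {
        pthread_mutex_lock(&lib_lock);
        unsafe_library_call(arg);
        pthread_mutex_unlock(&lib_lock);
    }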
Anyway, no matter when multicore/distributed processing actually gets implemented, it would be wise to start *planning* for it as soon as possible. As you all know making code threadsafe will only become more complicated as the code gets bigger.

On Sun, 2008-03-16 at 19:31 +0100, Mihaela wrote:
I think it's fair to say no software ever is; you always have parts that can be split to work on more than one core, and parts that can't (the latter can even stay tangled spaghetti code).
There's still the infrastructural issues; for example, we have to be careful about libgc until we can rely on the new versions with better thread support being widely available.
It would be worthwhile to analyze the code and see what parts are appropriate for multithreading. Maybe you would only find 2 or 3 such places, but the speed increase on multicore machines could still be very significant.
I've done this to an extent, actually. The biggest potential win would be rendering, although we have to be careful since once we start taking advantage of hardware acceleration, rendering becomes more of an IO issue than a computation one. (More threads tend to make IO worse rather than better, since you saturate your IO bandwidth quickly and start to take hits from multiplexing.)
The biggest problem right now is that the rendering/arena code reaches up vertically through nearly every Inkscape subsystem, in large part via its dependency on SPStyle. Until we break that dependency, I don't have much hope about parallelizing rendering generally. The exception may be the rendering of SVG filter effects, the inner loops of which are relatively isolated from the rest of the code. Filter effects are also the least likely to get hardware acceleration anytime soon, so we don't have to worry about that complication either.
Most of the rest of the codebase isn't really going to parallelize well, ever; a lot of things like document operations and transactions are rather inherently serial. The performance problems there are also often more a matter of doing too much unnecessary work than anything else.
A few threads could be applied to improve UI responsiveness, and I'd eventually like to do that, but we'd need to be very cautious about non-determinism. We've already had some very nasty problems from non-determinism introduced by people naively using Glib idle tasks or calling back into the event loop from an event handler; threads would be several orders of magnitude worse in that respect.
The multithreading expert I mentioned isn't familiar with the Inkscape code and would need your input when it came to a specific case,
Could you bring him into the thread, maybe?
In the new version of gcc, OpenMP will be implemented (I think it's in an experimental phase now). With it, parallelization will be much easier, so it would also be very good to implement OpenMP in Inkscape.
I believe gcc has had some OpenMP support in released versions since 4.2. OpenMP is rather limited in what it can do for us, but it should be helpful for optimizing some computation-heavy loops (filter effects again?). Generally I would like to try to avoid the use of explicit threading when possible, and OpenMP certainly fits that bill.
-mental

On Sun, 2008-03-16 at 20:08 -0400, MenTaLguY wrote:
In the new version of gcc, OpenMP will be implemented (I think it's in an experimental phase now). With it, parallelization will be much easier, so it would also be very good to implement OpenMP in Inkscape.
I believe gcc has had some OpenMP support in released versions since 4.2. OpenMP is rather limited in what it can do for us, but it should be helpful for optimizing some computation-heavy loops (filter effects again?). Generally I would like to try to avoid the use of explicit threading when possible, and OpenMP certainly fits that bill.
I'm curious if we wouldn't get more gain out of using something like liboil in these cases. The reality is that our filters are rather small overall. Probably something like MMX would get bigger gains than full-scale multiprocessing until we can go "full-multithreaded" across the codebase.
--Ted

On Mon, 17 Mar 2008 11:14:03 -0700, Ted Gould <ted@...11...> wrote:
I believe gcc has had some OpenMP support in released versions since 4.2. OpenMP is rather limited in what it can do for us, but it should be helpful for optimizing some computation-heavy loops (filter effects again?). Generally I would like to try to avoid the use of explicit threading when possible, and OpenMP certainly fits that bill.
I'm curious if we wouldn't get more gain out of using something like liboil in these cases. The reality is that our filters are rather small overall. Probably something like MMX would get bigger gains than full-scale multiprocessing until we can go "full-multithreaded" across the codebase.
I think OpenMP and liboil would be complementary in this case, and filter rendering times are in fact a major pain point for users.
-mental
participants (17)
- unknown@example.com
- Aaron Spike
- Bob Jamison
- Bryce Harrington
- bulia byak
- Chris Lilley
- jiho
- Joel Holdsworth
- John Bintz
- john cliff
- Jon A. Cruz
- Maximilian Albert
- MenTaLguY
- Mihaela
- Peter Moulder
- Ted Gould
- Terry Brown