GSoC-16 "Multiversion"-Coding period - Week 3

Submitted by rakesh on Tue, 06/14/2016 - 18:31

Background Information: We needed a PHP Library for our project which could return the Lowest Common Ancestor (LCA) from a Directed Acyclic Graph (DAG). This LCA would act as a base node for our Remote and Local nodes and the nodes will be compared with this LCA to detect a conflict otherwise merge. For more details about the project, please refer to this page. Let's take a look at the detailed description of the problem as well as the approach we are working on. It will very much clear the need of LCA and the Libraries we've been working on.

Project Detail and Problem: With the introduction of Multiversion module in Drupal 8, we suddenly have a very powerful content revision API that can handle branching and conflict DETECTION. But there is not yet a way to SOLVE revision conflicts.

How would merge conflicts occur in Drupal: The Workspace module is the module people use for creating new versions/branches of their site.The Drupal 8 version of Workspace depends on Multiversion to create and edit workspaces, switch between workspaces,it’s branches and view the revision tree of an entity. And users can push content between workspaces with the Deploy module as the Workspace module itself doesn't provide any functionality for pushing content between workspaces. If two users working on different workspaces modify the same content in any entity and try to push their revisions on another branch, the system wouldn’t know which changes are to be kept and which to discard. This is how Merge conflicts may occur in Drupal and needs to be solved manually by user.

Solution: A PHP library would be created that would perform a recursive 3 way merge function to merge the revisions in the base entities from the updated remote entity or local entity. It would also resolve merge conflicts (in case any occurs) and would be integrated with the plugin code.

There are 2 ways to merge the changes into the base entity:

  •  Manual - In case, merge conflicts occurs.
  • Automatic: The system would automatically detect the changes to keep and would merge them with base entity. As we are planning to achieve the automatic merge functionality using recursive 3 way merge, We would need:
  1. Source/Remote entity
  2. Base entity
  3. Destination/Local entity

Recursive 3 way merge: It turns a manual conflict into an automatic resolution. Part of the magic here relies on the VCS locating the original version of the entity (Base entity). This original version is better known as the "Lowest Common Ancestor (LCA)". The VCS then passes the common ancestor and the two contributors to the three-way merge tool that will use all three to calculate the result. The tool compares the local and remote entities with the LCA (base entity) and updates the base entity according to the modified entity. Example given below:

Merge conflicts don't occur if and only if:

  1. Only one entity (Either remote or Local) has been modified. In that case, base entity is modified according to the updated entity and all new modification are merged in the base entity. For example:

NoConflict

 

As we can see in Figure 2, only one developer actually changed line 30, so the conflict can be automatically resolved: Just keep "Yours" as the solution, so it will say: Print ("hello");

  1. Both entities  (Remote and Local) are edited but at different index such that no common lines are edited in both the entities. For ex: Addition is done in remote entity from line 8 to 11 and lines 17-19 has been removed in local entity. In that case, addition will be made in base entity on line 8-11 and removed lines from local entity would be removed as well from their respective position in modified base entity.                                    

Conflict

 

But if in any case both the entity (Remote and Local) has been edited, a merge conflict would occur and would have to be resolved manually as VCS can not decide which changes are to be kept. Once the conflicts are resolved, merging can be completed.

Targets achieved in past weeks: In our first 2 weeks of coding period, we have completed our PHP Library to find the Lowest common ancestor of two nodes from a Directed acyclic graph. Detailed description of the progress and the PHP Library to find LCA can be found at GSoC-2k16 Coding period - Week 1 and GSoC-2k16 Coding period - Week 2 . We have also implemented tests for basic graphs.

Work done this week: This week, we've implemented 3 tasks:

  • More tests for PHP Library to find LCA with much more complex graph and structures.
  • Implemented the PHP Library to perform recursive 3 way merge algorithm on the nodes otherwise throw an exception.
  • Tests for PHP Library to perform 3 way merge algorithm with different type of arrays, key value pairs and conflicts.

Let's start with the first task of this week i.e, writing tests for complex graphs. For example,

  • Graph with nodes having multiple parents.

  • Graph with nodes having no parents.

  • Graph having an edge to itself (Recursive traversal).

It was a little difficult for me in the starting but now I am getting familiar with the actual software development. I came to know it's not only about coding, you must follow the standards too. My mentors Tim Millwood, Andrei Jechiu and Dick Olsson being really supportive has been teaching me a lot about it like "Adding appropriate docs and comments", "Following PSR-2 standards strictly" and many other small things for which I believe I'd have take a lot of time to  learn on my own.

Other than the test cases, we have added dependencies in "composer.json" file which would make easy for users to install dependencies. We have also added PSR-2 check for travis in ".travis.yml" file, so that no builds with standard errors would pass.

The code to find the Lowest Common Ancestor (LCA) has been merged into the relaxedws/lca repository this week. However, there are a few things which we have to still work on like recursive LCA finding algorithm which would find a common LCA in case more than one LCA is found.

Along with all this, we started our work with the Merge Library which would be used to compare and merge changes. For those who are new to the concept, Let me explain it again.

In the First Library we created which would to used to find the LCA, It'd find us the Base node or Parent node with which we'd compare our remote and local nodes and decide how the merging should be done.

This is what our second task was about, a library to perform merging based on the changes. We have successfully implemented that as well. It'd check what changes have been made and would compare them with other nodes and result a new node with appropriate changes in that. It follows the approach defined as:

  • It would go through every key-value pair in parent node and check if it is same as local node; If yes, it'd merge remotes changes into the new node (array).

  • If the parent node is not identical to local node but to remote node; It'd merge the local's value for that key in the new node (array)

  • If both the local and remote are identical but different from the parent node, the new node (array) can have any one's data copied to it. We are using the local node's data.

  • If none of the above are true, That signifies that all the 3 nodes have a different value for some key and hence a conflict would occur which would throw an exception and would have to be handled manually.

We have successfully implemented all this code along with some tests for it. Tests cases covers the case where:

  1. There is no conflict and merging can be done easily.

  2. There is conflict and exception needs to be thrown.

  3. The keys are identical but in different order.

  4. They key value pairs are recursive, for example: A key containing a array whose nodes might contain another array.

Along with these files, we have added "composer.json" with test and phpunit dependencies and .travis.yml for PSR-2 checks in this library as well. Complete code can be found at sharprakeshverma/merge repository.

This week, we'd be working on more intensive test case, we need A LOT more test cases so that we could be sure that it is working just fine and won't return false conflicts or merge wrong nodes as it may result in drastic errors.

Image Credits: DrDobbs