[Php] Diff Parser And More

Many people will tell me that this is quite unnecessary, but if you’re willing to set up your own git server to manage your TrinityCore fork, and you want to make patches (assuming you’re not using GitHub), this may be of some use.

http://pastebin.mozilla.org/1526520

And here is how to use it (behavior 0 outputs a raw diff, 1 outputs HTML, 2 outputs JSON):

diffReader.php?diffFile=path/to/your/file&behavior=1

The only interesting feature is that it lets you remove mode changes and diffs in paths you do not want to share. For now, this is a really basic feature (I made it so I could remove all changes made to the /dep/ folder), but I’m planning to make it handle regular expressions, i.e. “/src/*”.
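As a rough sketch of how a glob-like path such as “/src/*” could be turned into a pattern usable with preg_match(): the helper name globToRegex() is made up for this illustration and is not part of the class. It assumes that only * and ? act as wildcards, and the resulting pattern is unanchored, so it matches anywhere in the path.

```php
<?php
// Hypothetical helper (not part of the posted class): turns a shell-style
// glob such as "/src/*" into a PCRE pattern delimited by '#'.
function globToRegex($glob)
{
    // Escape every regex metacharacter, including the '#' delimiter.
    $quoted = preg_quote($glob, '#');
    // preg_quote() turned '*' into '\*' and '?' into '\?'; map them to wildcards.
    $quoted = str_replace(array('\*', '\?'), array('.*', '.'), $quoted);
    return '#' . $quoted . '#';
}

$pattern = globToRegex('/src/*');
var_dump(preg_match($pattern, 'a/src/server/game/World.cpp')); // int(1)
var_dump(preg_match($pattern, 'a/dep/zlib/zlib.h'));           // int(0)
```

The pattern is left unanchored on purpose, because the paths in a git diff header carry an a/ or b/ prefix.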

The basic usage is:

  1. Give the path to the original diff file (generated by running git diff upstream/master…HEAD, where upstream points to TrinityCore’s repository on GitHub).

  2. Perform some operations on the diff (removing mode-only changes, filtering paths, etc.).

  3. Specify a behavior (which is really an output format), and ta-da.
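The three steps above can be sketched in a few lines without the class. This is only an illustration under assumptions: filterDiff() is a hypothetical stand-in for the removeModeChanges()/removePaths()/ToDiff() chain, and it treats any file section without an "index" line as a mode-only change, which mirrors what the parser does.

```php
<?php
// Sketch only: filterDiff() is a hypothetical name, not part of the posted class.
function filterDiff($diffText, array $pathPatterns)
{
    // Step 1: split the diff into per-file sections, each starting with "diff --git".
    $sections = preg_split('/^(?=diff --git )/m', $diffText, -1, PREG_SPLIT_NO_EMPTY);

    $kept = array();
    foreach ($sections as $section) {
        // Step 2a: a section without an "index" line carries no content change.
        if (!preg_match('/^index [0-9a-f]+\.\.[0-9a-f]+/m', $section))
            continue;
        // Step 2b: drop the section if its header line matches any pattern.
        $header = strtok($section, "\n");
        foreach ($pathPatterns as $pattern) {
            if (preg_match($pattern, $header))
                continue 2;
        }
        $kept[] = $section;
    }
    // Step 3: output, here simply as a raw diff again.
    return implode('', $kept);
}

$diff = "diff --git a/dep/zlib.c b/dep/zlib.c\n"
      . "index 1111111..2222222 100644\n"
      . "--- a/dep/zlib.c\n+++ b/dep/zlib.c\n"
      . "@@ -1 +1 @@\n-old\n+new\n"
      . "diff --git a/run.sh b/run.sh\n"
      . "old mode 100644\nnew mode 100755\n"
      . "diff --git a/src/World.cpp b/src/World.cpp\n"
      . "index 3333333..4444444 100644\n"
      . "--- a/src/World.cpp\n+++ b/src/World.cpp\n"
      . "@@ -1 +1 @@\n-foo\n+bar\n";

// Only the src/World.cpp section survives the filtering.
echo filterDiff($diff, array('#/dep/#'));
```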

Using the ToHTML() method is really nice if you want a good-looking view of your modified diff. It also uses a WebKit-specific CSS property (-webkit-user-select) so that you can use Ctrl+A without getting line numbers all over the clipboard’s content. As a consequence, it’s mainly designed for Google Chrome. If you happen to know whether such CSS properties exist for other browsers, feel free to give me the tip.
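A possible answer, offered as something to verify rather than a tested fix: as far as I know, Firefox historically used -moz-user-select, Internet Explorer 10+ used -ms-user-select, and current browsers accept the unprefixed user-select. A small sketch that builds the rule for the .diff-line-number class with all variants (the $prefixes list is an assumption; check it against the browsers you target):

```php
<?php
// Assumed list of vendor prefixes; the empty string yields the standard property.
$prefixes = array('-webkit-', '-moz-', '-ms-', '');

$declarations = '';
foreach ($prefixes as $prefix)
    $declarations .= $prefix . 'user-select: none; ';

// Rule to echo inside the <style> block, next to the existing .diff-line-number rule.
$css = '.diff-line-number { ' . $declarations . '}';
echo $css;
```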

This may be totally useless though, as git MAY be able to perform those changes itself, but I’m not familiar enough with it. Until then, this is a fairly good alternative.

Cheers.

P.S.: Note that it can be used for any diff, not only TrinityCore ones. It doesn’t GENERATE diffs, though.

Updated the source with some bug fixes, code shrinking and methods description. First link updated.

Also added regexp support, no “real” check for them though.
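Since the patterns are not really checked, here is a minimal sketch of how such a check could look. isValidRegex() is a hypothetical helper, not something the class provides; it relies on preg_match() returning false when a pattern cannot be compiled.

```php
<?php
// Hypothetical helper: returns true when $pattern compiles as a PCRE pattern.
function isValidRegex($pattern)
{
    // preg_match() returns false (and emits a warning, silenced here with @)
    // when the pattern cannot be compiled.
    return is_string($pattern) && @preg_match($pattern, '') !== false;
}

var_dump(isValidRegex('#/dep/#')); // bool(true)
var_dump(isValidRegex('/src/*'));  // bool(false): '*' is read as an unknown modifier
var_dump(isValidRegex('dep'));     // bool(false): no valid delimiter
```

A check like this could run at the top of removePaths() before any diff is removed.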

Some updates to reduce memory usage, plus some cleanup.

removePath has been removed in favor of removePaths.

[CODE]
<?php
class DiffParser
{
	private $lineMap = array();
	private $diffList = array(); // One entry per "diff --git" section
	/*
	 * @description Parses the provided diff file.
	 *			  If $destructive is set to true, $this->lineMap will be destroyed after parsing is processed. Use it to save memory with huge diffs
	 * @author	  Warpten
	 */
	public function __construct(/* string */ $fileName, /* bool */ $destructive = false) {
		$handle = @fopen($fileName, 'r');
		if (!$handle)
			throw new Exception("Provided diff file is invalid, that is, either not found or not a real diff file.");
		while (($line = fgets($handle)) !== false)
			$this->lineMap[] = $line;
		fclose($handle);

		$this->_parse($destructive);
	}

	/*
	 * @description Go through the file's content and parse it.
	 * @author	  Warpten
	 * @return	  void
	 */
	private function _parse(/* bool */ $destructive) {
		$index = 0;
		$maximum = count($this->lineMap);
		while ($index < $maximum) {
			if (preg_match("#^diff --git (.+) (.+)$#i", $this->lineMap[$index], $files)) {
				$tempData = array();
				$tempData['fileBefore'] = trim($files[1]);
				$tempData['fileAfter']  = trim($files[2]);
				$index++;
				if (substr($this->lineMap[$index], 0, 9) == "old mode ") { // Mode change
					$tempData['oldMode'] = trim(substr($this->lineMap[$index], 9));
					$index++; // Move on to the "new mode" line that follows "old mode"
					if ($index < $maximum && substr($this->lineMap[$index], 0, 9) == "new mode ") {
						$tempData['newMode'] = trim(substr($this->lineMap[$index], 9));
						$index++;
					}
				}
				elseif (substr($this->lineMap[$index], 0, 14) == "new file mode ") { // File created
					$tempData['modeAfterAddition'] = trim(substr($this->lineMap[$index], 14));
					$index++;
				}
				elseif (substr($this->lineMap[$index], 0, 18) == "deleted file mode ") { // File deleted
					$tempData['modeBeforeDeletion'] = trim(substr($this->lineMap[$index], 18));
					$index++;
				}

				// Index info
				if ($index < $maximum && preg_match("#^index ([a-z0-9]+)\.\.([a-z0-9]+)#i", $this->lineMap[$index], $matches)){
					$tempData['oldIndex'] = $matches[1];
					$tempData['newIndex'] = $matches[2];
				}
				else {
					// No index line means there is no content change (e.g. a mode-only change).
					// Store the entry and let the outer loop handle the current line, which may
					// already be the next "diff --git" header.
					$this->diffList[] = $tempData;
					continue;
				}

				// Skip the index line and the two file header lines ("--- ..." and "+++ ...").
				$index += 3; // 3 jumps to land on the first hunk line
				// Loop until end of file or new diff
				while ($index < $maximum && substr($this->lineMap[$index], 0, 10) != "diff --git") {
					if (isset($tempData['diffBody']))
						$tempData['diffBody'] .= utf8_decode($this->lineMap[$index]);
					else
						$tempData['diffBody'] = utf8_decode($this->lineMap[$index]);
					$index++;
				}

				// Add diff to list
				$this->diffList[] = $tempData;
			}
			else
				$index++;
		}

		if ($destructive)
			unset($this->lineMap);
	}

	/*
	 * @description Remove diffs that correspond to mode changes ONLY
	 * @author	  Warpten
	 * @return	  $this
	 */
	public function removeModeChanges(/* void */)
	{
		foreach ($this->diffList as $index => $diffData)
			if (!isset($diffData['diffBody']))
				unset($this->diffList[$index]);
		return $this->remapTable($this->diffList);
	}
	/*
	 * @description Remove diffs whose path matches ANY of the provided RegExp (a single pattern or an array of patterns).
	 * @author	  Warpten
	 * @return	  $this
	 */
	public function removePaths(/* string|array */ $pathArray, /* bool */ $remap = true)
	{
		$table = $pathArray;
		$remQty = 0;
		if (!is_array($pathArray))
			$table = array($pathArray);
		foreach ($table as $i => $path)
		{
			if (!is_string($path))
				throw new Exception("Element " . $i .  " of the array passed to " . __FUNCTION__ . " should be a string, " . gettype($path) . " passed.");

			foreach ($this->diffList as $idx => $diffData)
			{
				if (preg_match($path, $diffData['fileBefore']) || preg_match($path, $diffData['fileAfter']))
				{
					unset($this->diffList[$idx]);
					++$remQty;
				}
			}
		}
		if ($remap && $remQty > 0)
			$this->remapTable($this->diffList);
		return $this;
	}

	/*
	 * @description Reindexes the provided array (passed BY REFERENCE) so that its keys are sequential again.
	 * @author	  Warpten
	 * @return	  $this
	 */
	private function remapTable(/* array */ &$table) {
		$table = array_values($table);
		return $this;
	}


	private function br2eol(/* string */ $text) { return str_replace('<br />', PHP_EOL, $text); }

	/*
	 * @description Replaces spaces with &nbsp;s (only the leading indentation when $onlyTabs is true) so that indentation is displayed properly in HTML
	 * @author	  Warpten
	 * @return	  Formatted string.
	 */
	private function tab2nbsp(/* string */ $text, /* bool */ $onlyTabs = true) {
		if (!$onlyTabs)
			return str_replace(' ', '&nbsp;', $text);
		if (preg_match("#^(([-|+| ]?)([ ]+))(.+)#", $text, $matches))
			$text = ($matches[2] <> ' ' ? $matches[2] : '') . str_repeat('&nbsp;', strlen($matches[3])) . ($matches[2] == ' ' ? '&nbsp;' : '') . substr($text, strlen($matches[1]));
		return $text;
	}

	/*
	 * @description Fix tags openers and closers that may bug out when printing HTML
	 * @author	  Warpten
	 * @return	  Formatted string
	 */
	private function fixTags(/* string */ $text) { return str_replace(array('<', '>'), array('&lt;', '&gt;'), $text); }

	/*
	 * @description Outputs data to a JSON-traversable object
	 * @author	  Warpten
	 * @return	  JSON
	 */
	public function ToJSON(/* void */) { return json_encode($this->diffList); }

	/*
	 * @description Outputs data to a raw var_dump call
	 */
	public function ToDump(/* void */) { var_dump($this->diffList); } // var_dump() prints directly and returns nothing

	/*
	 * @description Output the data to a raw diff format.
	 *			  Useful after you removed some diffs from the original one.
	 * @author	  Warpten
	 * @return	  string
	 */
	public function ToDiff(/* void */) {
		$return = '';
		foreach ($this->diffList as $diffData) {
			$return .= 'diff --git ' . $diffData['fileBefore'] . ' ' . $diffData['fileAfter'] . PHP_EOL;
			if (isset($diffData['oldMode']) && isset($diffData['newMode']))
				$return .= "old mode " . $diffData['oldMode'] . PHP_EOL . 'new mode ' . $diffData['newMode'] . PHP_EOL;
			else if (isset($diffData['modeBeforeDeletion']))
				$return .= 'deleted file mode ' . $diffData['modeBeforeDeletion'] . PHP_EOL;
			else if (isset($diffData['modeAfterAddition']))
				$return .= 'new file mode ' . $diffData['modeAfterAddition'] . PHP_EOL;

			if (isset($diffData['diffBody'])) {
				$return .= 'index ' . $diffData['oldIndex'] . '..' . $diffData['newIndex'] . PHP_EOL;

				// The "---"/"+++" lines carry file paths, not modes
				if (isset($diffData['modeBeforeDeletion']))
					$return .= '--- ' . $diffData['fileBefore'] . PHP_EOL . '+++ /dev/null' . PHP_EOL;
				else if (isset($diffData['modeAfterAddition']))
					$return .= '--- /dev/null' . PHP_EOL . '+++ ' . $diffData['fileAfter'] . PHP_EOL;
				else
					$return .= '--- ' . $diffData['fileBefore'] . PHP_EOL . '+++ ' . $diffData['fileAfter'] . PHP_EOL;

				// Append the hunks; mode-only entries have no body
				$return .= $diffData['diffBody'];
			}
		}
		return $return;
	}

	/*
	 * @description Output the data to a beautiful HTML-based output.
	 * @author	  Warpten
	 * @return	  string
	 */
	public function ToHTML(/* bool */ $withColors = true) {
		$html = '<table class="diff-holder">';
		$itr = 0;
		foreach ($this->diffList as $i => $diffData)
			$html .= $this->GetHTMLForDiff($i, $withColors, $itr);
		return $html . '</table>';
	}

	/*
	 * @description Get the HTML code (excepting the <table class="diff-holder"><tbody> tags) that corresponds to one of the file
	 *			  dumped for it to be printed through either direct call or ToHTML.
	 * @author	  Warpten
	 * @return	  string
	 */
	public function GetHTMLForDiff(/* int */ $i, /* bool */ $withColors = true, /* int */ &$itr = 1) {
		if ($i < 0)
			$i = count($this->diffList) + $i;
		if (!isset($this->diffList[$i]))
		{
			trigger_error("Couldn't find the diff at index #{$i}, remapping the array. Unexpected results can be expected.", E_USER_NOTICE);
			$this->remapTable($this->diffList);
			if (!isset($this->diffList[$i]))
				return ''; // Still nothing at that index: nothing to render
		}
		$diffData = $this->diffList[$i];
		$html = '<tr><td class="diff-line-number">' . $itr . '</td><td class="diff-line-content">diff --git ' . $diffData['fileBefore'] . ' ' . $diffData['fileAfter'] . '</td></tr>';
		$itr++;
		if (isset($diffData['oldMode']) && isset($diffData['newMode'])) {
			$html .= '<tr><td class="diff-line-number">' . $itr . '</td><td class="diff-line-content">old mode ' . $diffData['oldMode'] . '</td></tr>';
			$itr++;
			$html .= '<tr><td class="diff-line-number">' . $itr . '</td><td class="diff-line-content">new mode ' . $diffData['newMode'] . '</td></tr>';
			$itr++;
		}
		else if (isset($diffData['modeBeforeDeletion'])) {
			$html .= '<tr><td class="diff-line-number">' . $itr . '</td><td class="diff-line-content">deleted file mode ' . $diffData['modeBeforeDeletion'] . '</td></tr>';
			$itr++;
		}
		else if (isset($diffData['modeAfterAddition'])) {
			$html .= '<tr><td class="diff-line-number">' . $itr . '</td><td class="diff-line-content">new file mode ' . $diffData['modeAfterAddition'] . '</td></tr>';
			$itr++;
		}
		if (isset($diffData['diffBody'])) {
			$html .= '<tr><td class="diff-line-number">' . $itr . '</td><td class="diff-line-content">index ' . $diffData['oldIndex'] . '..' . $diffData['newIndex'] . '</td></tr>';
			$itr++;

			if (isset($diffData['modeBeforeDeletion'])) {
				$html .= '<tr><td class="diff-line-number">' . $itr . '</td><td class="diff-line-content">--- ' . $diffData['fileBefore'] . '</td></tr>';
				$itr++;
				$html .= '<tr><td class="diff-line-number">' . $itr . '</td><td class="diff-line-content">+++ /dev/null</td></tr>';
				$itr++;
			}
			else if (isset($diffData['modeAfterAddition'])) {
				$html .= '<tr><td class="diff-line-number">' . $itr . '</td><td class="diff-line-content">--- /dev/null</td></tr>';
				$itr++;
				$html .= '<tr><td class="diff-line-number">' . $itr . '</td><td class="diff-line-content">+++ ' . $diffData['fileAfter'] . '</td></tr>';
				$itr++;
			}
			else {
				$html .= '<tr><td class="diff-line-number">' . $itr . '</td><td class="diff-line-content">--- ' . $diffData['fileBefore'] . '</td></tr>';
				$itr++;
				$html .= '<tr><td class="diff-line-number">' . $itr . '</td><td class="diff-line-content">+++ ' . $diffData['fileAfter'] . '</td></tr>';
				$itr++;
			}
			$contentInLines = explode(PHP_EOL, $diffData['diffBody']);
			$contentInLines = array_slice($contentInLines, 0, count($contentInLines) - 1);

			foreach ($contentInLines as $bodyLine) {
				$html .= '<tr>';
				if ($withColors)
					$classExtra = (substr($bodyLine, 0, 1) == '+' ? ' plus' : (substr($bodyLine, 0, 1) == '-' ? ' minus' : ''));
				else
					$classExtra = '';
				$html .= '<td class="diff-line-number">' . $itr . '</td><td class="diff-line-content' . $classExtra. '">' . $this->fixTags($this->tab2nbsp($bodyLine, false)) . '</td></tr>';
				$itr++;
			}
		}
		return $html;
	}
}
/* ************************************************************************************************** */
/* ************************** End of OOP class - Example usage shown below ************************** */
/* ************************************************************************************************** */
if (!isset($_GET['diffFile']))
	die("Please provide a path to your diff file using the `diffFile` GET parameter");
$behavior = isset($_GET['behavior']) ? $_GET['behavior'] : 0;
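// Example only: the path comes straight from the query string, so do not expose
// this script on a public server without validating it first.
$diffParser = new DiffParser($_GET['diffFile'], true);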
switch ($behavior)
{
	case 0: // To diff (No formatting)
		echo $diffParser->removeModeChanges()->removePaths('#dep#')->ToDiff();
		break;
	case 1: // To HTML
		// This is some cheesy style derived from GitHub's diff viewer - Make your own.
		echo '<!DOCTYPE html>';
		echo '<html><head><style type="text/css">';
		echo '.diff-holder { border-collapse: collapse; vertical-align: top; width: 100%; }';
		echo '.diff-line-content { padding: 0px 5px; font-family: Consolas; font-size: 13px; line-height: 16px; background-color: #F8F8FF; }';
		echo '.diff-line-number { text-align: right; color: gray; word-break: normal; white-space: nowrap; font-size: 13px; font-family: Consolas; border-right: 1px solid #BBB; background-color: #F0F0F0; padding-right: 5px; line-height: 16px; -webkit-user-select: none; width: auto; padding-left: 5px; }';
		echo '.minus { background-color: #FDD; }';
		echo '.plus { background-color: #DFD; }';
		echo '* { margin: 0; padding: 0; }';
		echo '</style></head><body>';
		if (!isset($_GET['getDiffById'])) { // Get the whole diff
			echo $diffParser->removeModeChanges()->removePaths(array('#\/dep\/#', '#\/sql\/#'))->ToHTML(isset($_GET['coloured']));
		}
		else { // Get one diff file		  
			echo '<table class="diff-holder">';
			echo $diffParser->removeModeChanges()->removePaths('#\/sql\/#i')->GetHTMLForDiff((int) $_GET['getDiffById'], isset($_GET['coloured']));
			echo '</table>';
		}
		echo '</body></html>';
		break;
	case 2: // To JSON
		echo $diffParser->removeModeChanges()->removePaths('#dep#')->ToJSON();
		break;
	default: // Reached for any behavior value other than 0, 1 or 2.
		echo "Provided behavior is unexpected, it must range from 0 to 2 included.";
}

[/CODE]