Finding Duplicate Strings
This guide demonstrates how to find duplicate strings using RhinoScript.
Imagine you have an array of strings that contains duplicates. RhinoScript has a method to cull duplicate strings, but suppose that, rather than culling them, you want to find them with a routine that returns the indices of the duplicate items.
Better yet, the routine could return sets of indices, with each set containing the indices at which a particular string appears. For example, if an array contained “Curve”, “Surface”, “curve”, “surface”, and the strings were compared without regard to case, the routine would return an array containing [0,2] and [1,3].
A Dictionary is a useful tool for storing associative data, that is, data in the form of (key, item) pairs. In the problem outlined above, you could use a Dictionary to track each string and the indices where it appears in the array. In other words, use a Dictionary to store (string, indices) pairs.
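As a minimal sketch of the (string, indices) idea, the following snippet stores one key and appends a second index to it. The Sub name DictionarySketch is hypothetical, and the indices are kept as a comma-separated string purely to keep the illustration short; that is an assumption of this sketch, not the approach used later in this guide.

```vbscript
Sub DictionarySketch

  Dim objDict, strKey

  Set objDict = CreateObject("Scripting.Dictionary")

  ' The compare mode must be set while the dictionary is still empty.
  ' vbTextCompare makes key lookups case-insensitive.
  objDict.CompareMode = vbTextCompare

  ' Record that "Curve" was seen at index 0
  ' (indices stored as text only for this sketch)
  objDict.Add "Curve", "0"

  ' Under text comparison, "curve" matches the existing key "Curve",
  ' so the new index is appended to the existing item
  If objDict.Exists("curve") Then
    objDict("curve") = objDict("curve") & ",2"
  End If

  ' Print each (string, indices) pair
  For Each strKey In objDict.Keys
    Call Rhino.Print(strKey & " -> " & objDict(strKey))
  Next

End Sub
```

Setting CompareMode to vbBinaryCompare instead would treat "Curve" and "curve" as two distinct keys.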
To store the indices, you are going to need an array. But creating and resizing VBScript arrays can be cumbersome, so you might consider using a .NET ArrayList object instead. A .NET ArrayList is a COM-enabled object, which means it can be used in VBScript.
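The short sketch below shows how an ArrayList might be created and filled from VBScript, then converted to an ordinary array. The Sub name ArrayListSketch is hypothetical, and the sketch assumes a .NET Framework version that registers System.Collections.ArrayList for COM is installed on the machine.

```vbscript
Sub ArrayListSketch

  Dim objList, arrIndices, nIndex

  ' Create the .NET ArrayList through COM
  ' (assumes the .NET Framework is installed and registered)
  Set objList = CreateObject("System.Collections.ArrayList")

  ' The list grows as needed, so no ReDim is required
  objList.Add 0
  objList.Add 2

  Call Rhino.Print("Count = " & CStr(objList.Count))

  ' Convert to a VBScript array for iteration
  arrIndices = objList.ToArray()
  For Each nIndex In arrIndices
    Call Rhino.Print(CStr(nIndex))
  Next

End Sub
```

Because the list resizes itself, the calling code only needs ReDim Preserve when it builds a plain VBScript array of results.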
The following sample function demonstrates how you can use a Dictionary of strings and .NET ArrayList objects to find the indices of duplicate string items in an array.
Function FindDuplicateStrings(arrStrings, blnCase)

  ' Local variables
  Dim objDict, strKey, objItem, arrItems
  Dim i, nCount
  Dim arrResults()

  ' Default return value
  FindDuplicateStrings = Null

  ' Create a dictionary object and set its compare mode
  Set objDict = CreateObject("Scripting.Dictionary")
  If (blnCase = True) Then
    objDict.CompareMode = vbBinaryCompare
  Else
    objDict.CompareMode = vbTextCompare
  End If

  ' Process input strings. If the string is not in the dictionary,
  ' then add it and add its index to the ArrayList. Otherwise,
  ' just add its index to the dictionary item's existing ArrayList.
  For i = 0 To UBound(arrStrings)
    strKey = arrStrings(i)
    If Not objDict.Exists(strKey) Then
      objDict.Add strKey, CreateObject("System.Collections.ArrayList")
    End If
    objDict(strKey).Add(i)
  Next

  ' Find all of the dictionary items that have more than one index.
  ' Add those arrays to our result array
  nCount = 0
  arrItems = objDict.Items
  For Each objItem In arrItems
    If (objItem.Count > 1) Then
      ReDim Preserve arrResults(nCount)
      arrResults(nCount) = objItem.ToArray()
      nCount = nCount + 1
    End If
  Next

  ' Done! Return the results only if duplicates were found;
  ' otherwise the function returns Null.
  If (nCount > 0) Then FindDuplicateStrings = arrResults

End Function
Here is an example of how to use the above function:
Sub TestFindDuplicateStrings

  Dim arrStrings, arrResults, arrItem, nItem, i

  arrStrings = Array("Curve", _
                     "Surface", _
                     "Mesh", _
                     "Point", _
                     "Surface", _
                     "Curve", _
                     "Curve")

  arrResults = FindDuplicateStrings(arrStrings, False)
  If IsArray(arrResults) Then
    Call Rhino.Print("Duplicate Sets = " & CStr(UBound(arrResults) + 1))
    For i = 0 To UBound(arrResults)
      Call Rhino.Print("Set = " & CStr(i + 1))
      arrItem = arrResults(i)
      For Each nItem In arrItem
        Call Rhino.Print("Item " & CStr(nItem) & " = " & arrStrings(nItem))
      Next
    Next
  End If

End Sub