- 17 Apr, 2014 20 commits
-
-
jcolosi authored
-
jcolosi authored
-
jcolosi authored
-
jcolosi authored
The UniqueUnion.jar file can now be run from the bin directory. When double-clicked there, it found the source files, generated new data, and deposited the output in the 'data/2014-04-17' directory.
-
jcolosi authored
UniqueUnion now finds the 'source' files in the root, reads as UTF-8, converts from U-label to A-label, and then generates the merged file. All data is deposited in the 'data' directory and labeled for the day of creation.
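The U-label to A-label step described in this commit could look roughly like the sketch below. It is not the UniqueUnion source: the class and method names are hypothetical, and it assumes the JDK's `java.net.IDN` class, which implements the older IDNA 2003 rules and may differ from whatever conversion the project actually uses.

```java
import java.net.IDN;

// Hypothetical helper, not taken from UniqueUnion itself.
final class LabelConverter {

    // Converts a single U-label (Unicode form) to its A-label (ASCII "xn--" form).
    // Assumption: surrounding whitespace is not significant and can be trimmed.
    static String toALabel(String uLabel) {
        return IDN.toASCII(uLabel.trim());
    }

    public static void main(String[] args) {
        // "b\u00FCcher" is "bücher", written with an escape so the source file stays ASCII.
        System.out.println(toALabel("b\u00FCcher"));
    }
}
```

Labels that are already plain ASCII pass through this call unchanged, so the same routine can be applied to every line of a source file.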
-
jcolosi authored
-
jcolosi authored
-
jcolosi authored
-
jcolosi authored
-
jcolosi authored
-
jcolosi authored
-
jcolosi authored
-
jcolosi authored
-
jcolosi authored
-
jcolosi authored
-
jcolosi authored
-
jcolosi authored
-
jcolosi authored
-
jcolosi authored
-
jcolosi authored
-
- 09 Apr, 2014 1 commit
-
-
Gavin Brown authored
-
- 27 Feb, 2014 3 commits
-
-
Gavin Brown authored
Duplicate entries
-
mike authored
-
mike authored
-
- 16 Jan, 2014 1 commit
-
-
Gavin Brown authored
-
- 13 Jan, 2014 1 commit
-
-
Gavin Brown authored
Adding a Java program to merge, sort, normalize, and make unique.
-
- 10 Jan, 2014 2 commits
-
-
John Colosi authored
Moved UniqueUnion.jar into the bin directory. Created a data/A-label directory and added A-label conversions for each section's file. Created a data/U-label directory but left it empty for now; files at the root could be moved here, or some other action could be taken. Added a final directory to store files with a unique union of Reserved Names, and added the unique union of A-labels to this directory.
-
John Colosi authored
Updated the toLowerCaseAscii routine to fix a bug. Also changed the handling of comment characters to prevent duplicates in certain situations.
-
- 09 Jan, 2014 1 commit
-
-
John Colosi authored
As part of normalization, I was using Java's String.toLowerCase() method. This changes UTF-8 values, corrupting the encoding.
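One way to avoid the problem described here is to lowercase only the ASCII letters A to Z and leave every other character untouched. The sketch below illustrates that idea. The method name `toLowerCaseAscii` is borrowed from a later commit message in this log, but the body is an assumption and not the project's actual routine.

```java
// Hypothetical sketch: the real toLowerCaseAscii routine may differ.
final class AsciiLower {

    // Lowercases only 'A'..'Z'; all other characters, including
    // non-ASCII letters, are copied through unchanged.
    static String toLowerCaseAscii(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            out.append(c >= 'A' && c <= 'Z' ? (char) (c + 32) : c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toLowerCaseAscii("Example-NAME"));
    }
}
```

Because the loop never touches characters outside the A to Z range, the result does not depend on the default locale, unlike `String.toLowerCase()`.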
-
- 08 Jan, 2014 1 commit
-
-
John Colosi authored
Adding a Java program to merge, sort, normalize, and make unique all the .txt files in a particular directory.
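The merge, sort, normalize, and de-duplicate steps named in this commit can be sketched as follows. This is an illustration under stated assumptions, not the committed program: the class name is hypothetical, normalization is reduced to trimming whitespace, and lines starting with `#` are assumed to be comments.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch of a "unique union" over the .txt files in one directory.
final class MergeSketch {

    static List<String> uniqueUnion(Path dir) throws IOException {
        // A TreeSet keeps entries sorted and drops duplicates in one step.
        TreeSet<String> names = new TreeSet<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "*.txt")) {
            for (Path file : files) {
                for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                    String name = line.trim();   // assumed normalization: trim only
                    if (name.isEmpty() || name.startsWith("#")) {
                        continue;                // assumed comment marker: '#'
                    }
                    names.add(name);
                }
            }
        }
        return new ArrayList<>(names);
    }
}
```

Calling `MergeSketch.uniqueUnion(dir)` on a directory returns the merged entries in sorted order; files without a `.txt` extension are skipped by the glob pattern.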
-
- 28 Nov, 2013 3 commits
-
-
Gavin Brown authored
-
Gavin Brown authored
-
Gavin Brown authored
-
- 27 Nov, 2013 7 commits
-
-
Gavin Brown authored
-
Gavin Brown authored
-
Gavin Brown authored
-
Gavin Brown authored
-
Gavin Brown authored
You need the Net_IDNA2 package from PEAR.
-
Gavin Brown authored
-
Gavin Brown authored
-