path: root/src/corefx
AgeCommit message (Collapse)AuthorFilesLines
2016-02-08Revert "Add un-prefixed signatures as temporary workaround"Jan Kotas2-36/+0
This reverts commit fb80bad2ed19970472ddefe539520abef42a52d0.
2016-02-02TimeZoneInfo.DisplayName values are not localized on LinuxEric Erhardt3-45/+84
Fixed by calling ICU's ucal_getTimeZoneDisplayName to read the display names for the current locale. Fix https://github.com/dotnet/corefx/issues/2748
2016-01-30Revert "Revert "Add un-prefixed signatures as temporary workaround""Jan Kotas2-0/+36
2016-01-30Revert "Add un-prefixed signatures as temporary workaround"Jan Kotas2-36/+0
This reverts commit 6b1d2938ec4a5a2c64fd849797ec7800ed3ab575.
2016-01-29Add un-prefixed signatures as temporary workaroundJan Kotas2-0/+36
2016-01-29Unique names for GlobalizationNative exportsJan Kotas10-166/+168
For consistency and to enable eventual sharing of the same code with CoreRT, I have changed the naming convention for System.Globalization.Native exports to match dotnet/corefx#4818.
2016-01-27Update license headersdotnet-bot11-44/+33
2016-01-13IdnMapping GetUnicode conformance tests failEric Erhardt1-1/+2
The IdnaConformanceTests fail on Unix because \u00DF, \u200C and \u200D characters are not being handled as specified in the http://www.unicode.org/Public/idna/6.0.0/IdnaTest.txt file. The fix is to use UIDNA_NONTRANSITIONAL_TO_UNICODE and UIDNA_CHECK_CONTEXTJ options when calling uidna_openUTS46. Partial fix for https://github.com/dotnet/corefx/issues/3406.
2016-01-11CompareOptions.IgnoreSymbols only ignores punctuation on Unix, but not other symbolsEric Erhardt4-2/+33
By default, ICU alternate shifted collation handling only ignores punctuation, not all symbols, so change the "variable top" to include all symbols and currency characters. Fix #4907
2016-01-08Convert System.Globalization.Native to use a configure.cmake and .h.in files.Eric Erhardt4-17/+17
2015-12-24GC OS interface refactoringJan Vorlicek1-0/+1
This change replaces all calls of OS specific functions in the GC by a call to a platform agnostic interface. Critical sections were abstracted too. The logging file access was changed to use CRT functions instead of Windows specific APIs. A "size" member was added to the card_table_info so that we can pass the right size to the VirtualRelease method when destroying the card table. I have also fixed a bug in the gc_heap::make_card_table error path where, when VirtualCommit failed, it called VirtualRelease with a size that was not the reserved size, but the committed size.

Other related changes
- All interlocked operations moved to Interlocked class as static methods
- Removed unused function prototypes
- Shuffled stuff in the root CMakeLists.txt to enable building the GC sample using the settings inherited from the root CMakeLists.txt and to clean up some things that have rotted over time, like the FEATURE_xxx macros not being in one alphabetically ordered block
- Fixed the VOLATILE_MEMORY_BARRIER macro in the gcenv.base.h
- Replaced uint32_t thread id by EEThreadId
- Removed thread handles storage (g_gc_thread) from the GC. The thread handle is closed right after the thread is launched. That allowed me to get rid of the GCThreadHandle
- Renamed the methods of the EEThreadId to be easier to understand
- Moved the gcenv.windows.cpp and gcenv.unix.cpp to the sample folder
2015-12-09Fixing collation for the following scenarios:Eric Erhardt1-29/+56
1. When IgnoreSymbols is true, ensure we still ignore half and fullwidth characters that are symbols.
2. Hiragana-Katakana characters differ at the tertiary strength, fixing the rule.
3. Fix collation on OSX which uses ICU 55.1. ICU 55 doesn't support having certain unicode characters using primary '<' rules. These characters are not necessary in the rules, since Windows always treats them the same. Removing 0x3099 and 0x309A from the half/full width rules.
2015-12-07Address PR feedback.Eric Erhardt1-21/+28
2015-12-07Ensure the collator map is thread safe on SortHandle.Eric Erhardt1-0/+9
2015-12-07Adding support for String CompareOptions IgnoreKanaType and IgnoreWidth on Unix.Eric Erhardt1-3/+160
2015-12-07Adding support for String CompareOptions IgnoreNonSpace and IgnoreSymbols.Eric Erhardt1-5/+33
2015-12-07Convert SortHandle to use a std::map to cache UCollators per option that is passed in.Eric Erhardt1-19/+36
This is in preparation of creating different UCollators for each option.
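A minimal sketch of the per-option cache this commit describes, not the actual SortHandle code. `FakeCollator`, `SortHandleSketch`, and `CreatedCount` are hypothetical names, the collator is a plain struct rather than an ICU `UCollator`, and the mutex is an assumption about how such a map could be kept safe across threads.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <mutex>

// Hypothetical stand-in for a collator that is expensive to create.
struct FakeCollator
{
    int32_t options;
};

// Hypothetical sort handle: one cached collator per options value.
class SortHandleSketch
{
public:
    // Returns the cached collator for `options`, creating it on first use.
    const FakeCollator* GetCollator(int32_t options)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        auto it = m_collators.find(options);
        if (it == m_collators.end())
        {
            // The expensive creation happens at most once per options value.
            ++m_created;
            it = m_collators.emplace(
                options, std::unique_ptr<FakeCollator>(new FakeCollator{options})).first;
        }
        assert(it->second); // creation is assumed to succeed in this sketch
        return it->second.get();
    }

    // Number of collators created so far (for illustration only).
    int CreatedCount() const
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_created;
    }

private:
    mutable std::mutex m_mutex;
    std::map<int32_t, std::unique_ptr<FakeCollator>> m_collators;
    int m_created = 0;
};
```

Asking twice for the same options value returns the same pointer, so repeated comparisons with the same options pay the creation cost once.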
2015-11-20Merge pull request #2112 from ellismg/cache-ucollatorsStephen Toub1-31/+78
Cache UCollators in a Locale
2015-11-20Cache UCollators in CompareInfoMatt Ellis1-31/+78
Creating a UCollator is an expensive operation and we are presently doing it on every collation operation. We can improve this by caching the UCollators we use for collation on the CompareInfo object itself.

This change introduces a new method GetSortHandle which gives back an opaque wrapper which can be used in collation operations instead of a culture name. Internally we represent this as a struct holding the two types of UCollators we care about (if we add additional collators per locale with different options to handle other types of CompareOption flags, we can cache these as well). Collation methods can get a `const UCollator*` reference from the sort handle which is safe to share across threads (per the ICU Design Guidelines[1]).

Unfortunately, tracking the lifetime of the SortHandle itself is not as straightforward as I would like. Right now, we use a SafeHandle to wrap the internal handle and rely on the finalizer of the class to clean up the native resources. However this means that the following code sample will create two finalizable objects:

```csharp
var c1 = new CultureInfo("en-US").CompareInfo;
var c2 = new CultureInfo("en-US").CompareInfo;
```

If this ends up being an issue, we could explore an approach where we keep a cache of SortHandles in managed code and pass out references to that SortHandle which would let us share a single SortHandle for a given locale across more than one CompareInfo object.

Wins are seen in places where we previously did lots of string comparisons in a tight loop (for example: dotnet/corefx#3811), moving these operations down to ~6ms per iteration vs ~330ms on my local machine.

[1]: http://userguide.icu-project.org/design
2015-11-19Merge pull request #2047 from eerhardt/ShortDateEric Erhardt1-5/+4
DateTimeFormat.ShortDatePattern should use CLDR 'short' format on Unix.
2015-11-15Add USEARCH_DONE check to StartsWith in System.Globalization.Nativestephentoub1-22/+24
The StartsWith ICU wrapper was not checking the result of usearch_first to see if it was USEARCH_DONE, indicating no match found. This has two ramifications:
1. When there isn't a match, USEARCH_DONE (-1) gets passed in as the textLength argument to ucol_openElements, which treats -1 as meaning the string isn't null-terminated, and thus ends up walking the string looking for non-ignorable collation elements. Our tests have been passing because they've been using strings containing only non-ignorable elements, and thus the first character checked causes us to bail and correctly return false. If nothing else, this is an unnecessary perf overhead.
2. But on top of that if there are only ignorable collation elements before the first null character in the string (e.g. if the string begins with a null character), then because we told ICU that the string ended at the first null character, it'll stop walking the string and return a match. e.g. "\0bug".StartsWith("test") returns true incorrectly.

This commit simply adds a check for USEARCH_DONE to StartsWith. EndsWith already has such a check.
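The shape of the fix can be sketched without ICU. Everything below is a stand-in: `kSearchDone` mimics the -1 sentinel, `FindFirst` replaces the real search call, and treating only U+0000 as an ignorable element is a simplifying assumption. The point is only that the sentinel has to be tested before the result is used as a length.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Stand-in for a "no match" sentinel such as USEARCH_DONE (-1).
const int32_t kSearchDone = -1;

// Hypothetical search: index of the first match of target in source, or kSearchDone.
int32_t FindFirst(const std::u16string& source, const std::u16string& target)
{
    std::u16string::size_type pos = source.find(target);
    return pos == std::u16string::npos ? kSearchDone : static_cast<int32_t>(pos);
}

// True when source[0, length) holds only ignorable units.
// Simplification: only U+0000 counts as ignorable here.
bool PrefixIsIgnorable(const std::u16string& source, int32_t length)
{
    assert(length >= 0 && static_cast<std::u16string::size_type>(length) <= source.size());
    for (int32_t i = 0; i < length; i++)
    {
        if (source[i] != u'\0')
        {
            return false;
        }
    }
    return true;
}

bool StartsWithSketch(const std::u16string& source, const std::u16string& target)
{
    int32_t index = FindFirst(source, target);
    if (index == kSearchDone)
    {
        // The added check: never reinterpret the sentinel as a length.
        return false;
    }
    // A match counts as a prefix only if everything before it is ignorable.
    return PrefixIsIgnorable(source, index);
}
```

With the check in place, a source that begins with a null character and does not contain the target reports no match.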
2015-11-13DateTimeFormat.ShortDatePattern should use CLDR 'short' format on Unix.Eric Erhardt1-5/+4
The DateTimeFormat.ShortDatePattern currently defaults to CLDR's 'yMd' skeleton. However, this value doesn't produce the best format for all cultures, e.g. "de-DE". LongDatePattern uses CLDR's 'full' format. To be symmetrical, the ShortDatePattern should be using CLDR's 'short' format. Fix https://github.com/dotnet/coreclr/issues/1736.
2015-11-13Pass target string lengths to ICU on Unixstephentoub1-8/+8
Our current ICU shims for StartsWith, EndsWith, IndexOf, and LastIndexOf take the length of the source string but not the length of the target string. This forces ICU to compute the length of the string by searching for a null terminator. We can save those costs and be more accurate around nulls in the target string by passing the known length in.
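To illustrate why an explicit target length is both cheaper and more accurate, here is a small ICU-free sketch. `ScanLength` and `StartsWithOrdinalSketch` are hypothetical helpers, not the shim's functions; the first does the terminator scan a callee is forced into when it is not told the length.

```cpp
#include <cassert>
#include <cstdint>

// Length found by scanning for a terminator, which is what a callee
// has to do when the caller does not supply the length.
int32_t ScanLength(const char16_t* s)
{
    int32_t n = 0;
    while (s[n] != u'\0')
    {
        n++;
    }
    return n;
}

// Ordinal prefix test with both lengths supplied by the caller.
bool StartsWithOrdinalSketch(const char16_t* source, int32_t sourceLength,
                             const char16_t* target, int32_t targetLength)
{
    assert(sourceLength >= 0 && targetLength >= 0);
    if (targetLength > sourceLength)
    {
        return false;
    }
    for (int32_t i = 0; i < targetLength; i++)
    {
        if (source[i] != target[i])
        {
            return false;
        }
    }
    return true;
}
```

For a three-unit target with an embedded null, the scan stops after one unit, so a comparison based on the scanned length accepts sources that differ after the null, while the known length rejects them.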
2015-10-25Merge pull request #1851 from ellismg/icu-remove-c-plus-plusMatt Ellis7-502/+676
Remove OSX Homebrew ICU dependency
2015-10-23Cleanup CMakeLists.txtMatt Ellis2-20/+27
There were a few problems that needed to be addressed:
- Our detection logic around testing if ICU supported a feature was still checking for C++ stuff instead of the corresponding C code (which we ended up using).
- There was some cleanup we could do now that the OSX and other Unix builds were split apart
2015-10-23Fix spelling issuesMatt Ellis1-2/+2
2015-10-23Use correct close function for UNumberFormatMatt Ellis1-1/+1
2015-10-22Use std::vector instead of callocMatt Ellis1-18/+6
This matches what we do in other places in calendarData.cpp, the RAII pattern will make it easier to not leak memory.
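A rough sketch of the pattern, assuming a fill-style API: `FillName` and `GetNameSketch` are hypothetical and only stand in for the kind of buffer-filling call made in calendarData.cpp. The vector is zero-initialized like a calloc'd block and is released on every return path without an explicit free.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical API that fills a caller-provided buffer.
// Returns the number of units written, or -1 if capacity is too small.
int32_t FillName(char16_t* buffer, int32_t capacity)
{
    static const char16_t name[] = u"gregorian";
    const int32_t length = 9;
    if (capacity < length)
    {
        return -1;
    }
    for (int32_t i = 0; i < length; i++)
    {
        buffer[i] = name[i];
    }
    return length;
}

bool GetNameSketch(int32_t capacity, std::u16string* result)
{
    assert(capacity >= 0 && result != nullptr);
    // Zero-initialized storage owned by the vector; no free() to forget.
    std::vector<char16_t> buffer(static_cast<std::size_t>(capacity));
    int32_t written = FillName(buffer.data(), capacity);
    if (written < 0)
    {
        return false; // early return, the buffer is still released
    }
    result->assign(buffer.data(), static_cast<std::size_t>(written));
    return true;
}
```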
2015-10-22Link against libicucore on OSXMatt Ellis1-14/+31
OSX ships with a copy of ICU (their globalization APIs are built on top of it). Since we only use stable C-based APIs, we can link against it using the methods described in using a System ICU in the ICU User's Guide (basically we disable function renaming, don't use C++ and only use stable APIs). The ICU headers are not part of the SDK, so we continue to need ICU installed via Homebrew as a build time dependency. Fixes dotnet/corefx#3849
2015-10-22Update IndexOfOrdinalIgnoreCase to use full code unitsstephentoub1-62/+57
2015-10-22Improve string.{Last}IndexOf perf on Unix for Ordinal/OrdinalIgnoreCasestephentoub1-0/+78
Our current implementation of IndexOfOrdinal for strings on Unix uses Substring to get the piece of the source string we care about; this results in an unnecessary allocation / string copy. When using OrdinalIgnoreCase, we also convert both the source and search strings to upper-case using ToUpperInvariant, resulting in more allocations. And our LastIndexOfOrdinal implementation delegates to IndexOfOrdinal repeatedly, incurring such allocations potentially multiple times. This change reimplements Ordinal searching in managed code to not use Substring, and it implements OrdinalIgnoreCase searching via new functions exposed in the native globalization shim, so as to use ICU without having to make managed/native transitions for each character. With the changes, {Last}IndexOf with Ordinal/OrdinalIgnoreCase are now allocation-free (as you'd expect), and throughput when startIndex/count and/or OrdinalIgnoreCase are used is increased significantly, on my machine anywhere from 20% to 3x, depending on the inputs.
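The allocation-free idea can be sketched as a search over a (pointer, startIndex, count) window that compares in place instead of copying a substring or upper-casing whole strings. This is an illustration only: `IndexOfOrdinalSketch` is a hypothetical name, and the ASCII-only case folding is a stand-in for the ICU-based folding the commit describes.

```cpp
#include <cassert>
#include <cstdint>

// ASCII-only upper-casing; a simplification standing in for ICU case mapping.
inline char16_t ToUpperAsciiOrdinal(char16_t c)
{
    return (c >= u'a' && c <= u'z') ? static_cast<char16_t>(c - 0x20) : c;
}

// Searches source[startIndex, startIndex + count) for target without copying.
// Returns the match index relative to the start of source, or -1.
int32_t IndexOfOrdinalSketch(const char16_t* source, int32_t startIndex, int32_t count,
                             const char16_t* target, int32_t targetLength, bool ignoreCase)
{
    assert(startIndex >= 0 && count >= 0 && targetLength >= 0);
    if (targetLength == 0)
    {
        return startIndex;
    }
    // Last position at which a full match could still begin.
    const int32_t last = startIndex + count - targetLength;
    for (int32_t i = startIndex; i <= last; i++)
    {
        int32_t j = 0;
        for (; j < targetLength; j++)
        {
            char16_t a = source[i + j];
            char16_t b = target[j];
            if (ignoreCase)
            {
                // Fold one unit at a time rather than allocating folded copies.
                a = ToUpperAsciiOrdinal(a);
                b = ToUpperAsciiOrdinal(b);
            }
            if (a != b)
            {
                break;
            }
        }
        if (j == targetLength)
        {
            return i;
        }
    }
    return -1;
}
```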
2015-10-21Hygiene cleanups in holders.hMatt Ellis1-6/+2
Use "= delete" syntax to make it clear the IcuHolder copy constructor and assignment operators are removed. Remove superfluous "public" modifier on the struct closers used by the IcuHolders.
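For readers unfamiliar with the pattern, a holder of this kind might look roughly like the sketch below. It is not the contents of holders.h: `HolderSketch`, `FakeResource`, and `FakeResourceCloser` are hypothetical, and the resource is a plain heap object rather than an ICU handle.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical resource standing in for an ICU handle with a matching close function.
struct FakeResource
{
    bool* closedFlag;
};

inline void FakeClose(FakeResource* r)
{
    assert(r != nullptr);
    *r->closedFlag = true;
    delete r;
}

// Closer as a struct: members are public by default, so no "public:" is needed.
struct FakeResourceCloser
{
    void operator()(FakeResource* r) const { FakeClose(r); }
};

// Owns a resource and closes it when leaving scope.
template <typename T, typename Closer>
class HolderSketch
{
public:
    explicit HolderSketch(T* p) : m_p(p) {}

    ~HolderSketch()
    {
        if (m_p != nullptr)
        {
            Closer()(m_p);
        }
    }

    // Copying would close the resource twice, so both operations are deleted.
    HolderSketch(const HolderSketch&) = delete;
    HolderSketch& operator=(const HolderSketch&) = delete;

    T* get() const { return m_p; }

private:
    T* m_p;
};

static_assert(!std::is_copy_constructible<HolderSketch<FakeResource, FakeResourceCloser>>::value,
              "holders must not be copyable");

bool UseResourceSketch(bool* closedFlag)
{
    HolderSketch<FakeResource, FakeResourceCloser> holder(new FakeResource{closedFlag});
    return holder.get() != nullptr; // the closer runs as holder goes out of scope
}
```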
2015-10-21Cleanup include directivesMatt Ellis5-28/+9
2015-10-21Remove use of icu::Locale C++ typeMatt Ellis5-128/+179
Remove all the uses of the icu::Locale type in favor of just using a char* which is the raw locale id (which is exactly what all the ICU C APIs use for a locale). The meat of this change is in locale.cpp to actually handle doing the conversion from UChar* to char*. The rest of the places are dealing with the fallout (GetLocale now has a different signature and the .getName() dance is no longer needed as we have a raw locale name all the time now).
2015-10-21Move off Locale instance methodsMatt Ellis3-93/+138
To prepare for removing icu::Locale in favor of just using the id directly, remove all the uses of Locale methods except for .getName(). We now use GetLocale to create a Locale but then turn it into a char* for all the helper methods. After this change, we can update GetLocale to do locale parsing into a char buffer and replace all the locale.getName() calls with just `locale`.
2015-10-21Remove C++ Locale use in EnumSymbolsMatt Ellis1-3/+4
2015-10-21Get Eras using ICU C API instead of C++Matt Ellis2-49/+93
Getting the regular eras is straightforward: we can do the thing we do for other locale data and just ask ICU using a specific UDateFormatSymbolType. For abbreviated eras, there's no C API, but we can try to just read the data from ICU resources and fall back to the standard width eras if that doesn't work.
2015-10-21Remove use of ICU C++ DateFormatSymbolsMatt Ellis1-58/+49
2015-10-21Remove usage of ICU C++ DecimalFormatSymbolsMatt Ellis1-40/+24
2015-10-21Remove ICU C++ LocaleDisplayNamesMatt Ellis2-6/+14
2015-10-21Remove use of ICU C++ NumberFormat classMatt Ellis2-65/+82
This change removes NumberFormat in favor of UNumberFormat. There is a bit of work that needs to happen in order to keep working the normalization code we use to convert an ICU pattern into something we can match against. Instead of UnicodeStrings, the input to the normalization function is now a UChar* and we build up a std::string during normalization. This allows us to also skip a conversion from UChar* back to char* so we can find the correct pattern in our collection of patterns to examine.
2015-10-21Remove use of ICU C++ DateFormatSymbolsMatt Ellis1-14/+4
2015-10-21Convert DateFormat to UDateFormatMatt Ellis3-83/+46
Remove uses of the C++ DateFormat and SimpleDateFormat classes in favor of UDateFormat. As part of this change, it was easier to move some of the code that converts an ICU format string to a .NET Style format string from native code up to managed code. This code used UnicodeString and we'll need to move away from that as well as we remove all the C++ usage.
2015-10-21Convert DateTimePatternGenerator usage to CMatt Ellis2-13/+40
Part of the effort to remove our usage of C++ ICU APIs. The major issue here was that the C++ API used char*'s for some things whereas the C API used UChar*'s so we needed to define our own copies of some constants. We also need to manage a buffer ourselves, instead of being able to use the underlying buffer of a returned UnicodeString.
2015-10-21Remove use of ICU C++ Calendar classMatt Ellis3-26/+90
We would like to be able to link against versions of ICU installed as an "operating system level library" which means we can't take a dependency on any C++ APIs. This change moves away from icu::Calendar in favor of UCalendar. I also introduce a small helper template to manage the lifetime of ICU resources.
2015-10-19Fix Turkish i casing with invariant culturestephentoub1-74/+127
It turns out there are some differences in the casing data used by Windows and ICU for our invariant culture. In particular, Turkish 'i' is handled differently. This change splits the shim ChangeCase function into three: ChangeCaseTurkish (used for Turkish culture), ChangeCaseInvariant (used for Invariant culture), and ChangeCase (used for everything else). ChangeCaseInvariant includes the new special cases. Call sites in the managed code are localized to a single function that determines which native function to invoke.
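A hedged sketch of the three-way split, upper-casing only. The commit message does not spell out the special cases; the ones shown (dotless i U+0131 upper-casing to itself for the invariant culture, and 'i' upper-casing to U+0130 for Turkish) reflect my understanding of the Windows and Turkish behaviour and are assumptions here. All names are hypothetical, and `SimpleToUpper` is an ASCII-plus-one-mapping stand-in for ICU's simple case mapping.

```cpp
#include <cassert>
#include <string>

const char16_t kDotlessI = 0x0131; // LATIN SMALL LETTER DOTLESS I
const char16_t kDottedI = 0x0130;  // LATIN CAPITAL LETTER I WITH DOT ABOVE

// Stand-in for a general simple upper-case mapping: ASCII plus U+0131 -> 'I'.
char16_t SimpleToUpper(char16_t c)
{
    if (c == kDotlessI)
    {
        return u'I';
    }
    return (c >= u'a' && c <= u'z') ? static_cast<char16_t>(c - 0x20) : c;
}

// Used for everything that is neither Turkish nor invariant.
std::u16string ToUpperDefaultSketch(const std::u16string& s)
{
    std::u16string r(s);
    for (char16_t& c : r) c = SimpleToUpper(c);
    return r;
}

// Invariant culture: assumed special case, dotless i maps to itself.
std::u16string ToUpperInvariantSketch(const std::u16string& s)
{
    std::u16string r(s);
    for (char16_t& c : r) c = (c == kDotlessI) ? c : SimpleToUpper(c);
    return r;
}

// Turkish culture: 'i' maps to capital I with dot above.
std::u16string ToUpperTurkishSketch(const std::u16string& s)
{
    std::u16string r(s);
    for (char16_t& c : r) c = (c == u'i') ? kDottedI : SimpleToUpper(c);
    return r;
}

enum class CasingSketch { Default, Invariant, Turkish };

// Single dispatch point, mirroring the idea of one call site choosing the function.
std::u16string ToUpperSketch(const std::u16string& s, CasingSketch casing)
{
    switch (casing)
    {
        case CasingSketch::Invariant: return ToUpperInvariantSketch(s);
        case CasingSketch::Turkish:   return ToUpperTurkishSketch(s);
        case CasingSketch::Default:   return ToUpperDefaultSketch(s);
    }
    assert(false && "unknown casing mode");
    return s;
}
```

Keeping the choice in one dispatcher means each specialized function can stay free of culture checks.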
2015-10-13Run format-code.shMatt Ellis10-1204/+1329
2015-10-13Add format-code.sh from CoreFXMatt Ellis2-0/+58
2015-10-09Add support for obtaining default locale in Linux and fix issue with @collation= not being passed to ICUSteve Harter2-7/+42
2015-10-08Improve string.ToLower/ToUpper perfstephentoub1-25/+45
Our Unix implementation of changing case is currently slower than our implementation on Windows. In our ChangeCase implementation in System.Globalization.Native that uses ICU, we have a loop that reads each code point, processes it, and writes out the result. That processing involves branching into four cases based on whether we're going to upper or to lower, and whether we're using Turkish or not. By manually hoisting the invariants and loop cloning in order to remove the branches from the inner loop, we can improve the performance of this routine by ~15-20%.
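The transformation described here can be illustrated on a toy version of the loop. This is a sketch under stated assumptions, not the shim's code: it works on UTF-16 code units instead of code points, uses ASCII-only mappings in place of ICU, and all names are hypothetical. The first function tests both flags for every character; the second tests them once and runs one of four specialized loops.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

const char16_t kTurkishCapitalI = 0x0130; // capital I with dot above
const char16_t kTurkishSmallI = 0x0131;   // small dotless i

inline char16_t UpperAscii(char16_t c)
{
    return (c >= u'a' && c <= u'z') ? static_cast<char16_t>(c - 0x20) : c;
}

inline char16_t LowerAscii(char16_t c)
{
    return (c >= u'A' && c <= u'Z') ? static_cast<char16_t>(c + 0x20) : c;
}

// Before: both loop-invariant flags are re-tested for every character.
std::u16string ChangeCaseBranchy(const std::u16string& src, bool toUpper, bool turkish)
{
    std::u16string dst(src.size(), u'\0');
    for (std::size_t i = 0; i < src.size(); i++)
    {
        const char16_t c = src[i];
        if (toUpper)
            dst[i] = (turkish && c == u'i') ? kTurkishCapitalI : UpperAscii(c);
        else
            dst[i] = (turkish && c == u'I') ? kTurkishSmallI : LowerAscii(c);
    }
    return dst;
}

// After: the flags are hoisted out and the loop is cloned four ways.
std::u16string ChangeCaseCloned(const std::u16string& src, bool toUpper, bool turkish)
{
    const std::size_t n = src.size();
    std::u16string dst(n, u'\0');
    if (toUpper)
    {
        if (turkish)
            for (std::size_t i = 0; i < n; i++)
                dst[i] = (src[i] == u'i') ? kTurkishCapitalI : UpperAscii(src[i]);
        else
            for (std::size_t i = 0; i < n; i++)
                dst[i] = UpperAscii(src[i]);
    }
    else
    {
        if (turkish)
            for (std::size_t i = 0; i < n; i++)
                dst[i] = (src[i] == u'I') ? kTurkishSmallI : LowerAscii(src[i]);
        else
            for (std::size_t i = 0; i < n; i++)
                dst[i] = LowerAscii(src[i]);
    }
    assert(dst.size() == n);
    return dst;
}
```

Both versions produce the same output; the cloned one trades some code size for an inner loop with fewer branches.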