Mailing List Archive

[GitHub] [lucene-jira-archive] mocobeta commented on issue #1: Fix markup conversion error
mocobeta commented on issue #1:
URL: https://github.com/apache/lucene-jira-archive/issues/1#issuecomment-1169860211

Thanks for reporting.

I found that Jira's numbered-list markup (`#`) is not converted correctly, so it is interpreted as a header in Markdown.

Jira dump
```
"body": "I'm definitely not an expert on this but after some research I found:\r\n # The real problem probably is we're assuming object alignment in 32 bit jvm is 4 bytes but they're actually default into 8 bytes in HotSpot JVM and can't be anything less than 8 bytes ([https://stackoverflow.com/questions/44468639/memory-alignment-of-java-classes)]\r\n # Object header may create offset for object alignment, like in your jol analysis, the header is 12 bytes long and thus created a 12%8=4 bytes offset, so that the target array size should cover those and that's why for {{byte[]}} 4,12,20... sizes are optimal, but I\u00a0*think* the header length can vary depend on either jvm or system, since I've seen some post with 2 mark words in the header which makes header 16 bytes\r\n\r\nSo there should be something we could optimize here, but probably need to figure out a way to identify how many bytes are in array header, ah [RamUsageEstimator|https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/util/RamUsageEstimator.java#L179,L187] listed the details out, the 64 bit machine's header is already aligned so we don't need to worry about the offset, and 32 bit machine's header is constant 12 bytes so with a 4 bytes offset.",
```

Converted Markdown data
```
"body": "I'm definitely not an expert on this but after some research I found:\r\n # The real problem probably is we're assuming object alignment in 32 bit jvm is 4 bytes but they're actually default into 8 bytes in HotSpot JVM and can't be anything less than 8 bytes (<https://stackoverflow.com/questions/44468639/memory-alignment-of-java-classes)>\r\n # Object header may create offset for object alignment, like in your jol analysis, the header is 12 bytes long and thus created a 12%8=4 bytes offset, so that the target array size should cover those and that's why for `byte[]` 4,12,20... sizes are optimal, but I\u00a0**think** the header length can vary depend on either jvm or system, since I've seen some post with 2 mark words in the header which makes header 16 bytes\r\n\r\nSo there should be something we could optimize here, but probably need to figure out a way to identify how many bytes are in array header, ah [RamUsageEstimator](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/util/RamUsageEstimator.java#L179,L187) listed the details out, the 64 bit machine's header is already aligned so we don't need to worry about the offset, and 32 bit machine's header is constant 12 bytes so with a 4 bytes offset.\n\nAuthor: Patrick Zhai (`@zhaih`)\nCreated: 2022-06-09T07:07:05.021+0000\nUpdated: 2022-06-09T07:07:05.021+0000\n",
```
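
One possible fix, sketched in Python: rewrite each Jira `#` list line as a Markdown `1.` item before the text reaches GitHub. This is only an illustration, not the converter used in this repository. The function name `convert_numbered_lists` and the regular expression are assumptions made for the example. The pattern tolerates leading whitespace because the list lines in the dump above start with ` # ` rather than `#`.

```python
import re

# Hypothetical pattern (an assumption for this sketch): optional leading
# whitespace, one or more '#' characters, at least one space, then the item text.
_JIRA_NUMBERED_ITEM = re.compile(r"^[ \t]*(#+)[ \t]+(.*)$")


def convert_numbered_lists(text: str) -> str:
    """Rewrite Jira '#' numbered-list lines as Markdown '1.' list items.

    Hypothetical helper for illustration only. Nested Jira items ('##',
    '###', ...) are indented by three spaces per extra level.
    """
    converted = []
    # splitlines() handles the '\r\n' line endings found in the Jira dump.
    for line in text.splitlines():
        match = _JIRA_NUMBERED_ITEM.match(line)
        if match:
            depth = len(match.group(1))
            indent = "   " * (depth - 1)
            # Markdown renumbers consecutive '1.' items automatically.
            converted.append(f"{indent}1. {match.group(2)}")
        else:
            converted.append(line)
    return "\n".join(converted)


sample = "after some research I found:\r\n # first point\r\n # second point\r\n\r\nSo there should be ..."
print(convert_numbered_lists(sample))
```

A line-based rewrite like this would also touch `#` lines inside Jira `{code}` blocks, so a real fix would need to skip those regions.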



--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

