Mailing List Archive: [RESULT] [VOTE] Migration to GitHub issue from Jira

Re: [RESULT] [VOTE] Migration to GitHub issue from Jira [ In reply to ]

Jun 17, 2022, 5:26 PM

Post #26 of 40 (972 views)

I don't intend to neglect the history in Jira... it's an important,
valuable asset for all of us and for possible future contributors.

It's important; *therefore*, I don't want to have degraded copies
of it on GitHub.
We cannot preserve all of the history - again, there would be a lot of
non-negligible information loss (timestamp, reporter, assignee,
markdown, metadata that cannot be ported to GitHub) if we attempted to
migrate the whole Jira history into GitHub. Rather than trying to keep
such incomplete copies, I would preserve the Jira issues in a fully
archived state, then simply refer to them.

Tomoko

On Sat, Jun 18, 2022 at 7:47 Gus Heck <gus.heck@gmail.com> wrote:
>
> I hope you count me as someone who sees history as important. It's important in more ways than one however. You gave the example of trying to understand something, and looking at the issue history directly. I also give weight to the scenario where someone has written a blog post about the topic and linked the issue "For the latest see LUCENE-XXXX" for example... Or someone planning upgrades has a spreadsheet of things to track down... The existing links should point to a *complete* history of the issue.
>
> I don't see the migration of everything to github as being as critical as you do but I'm not at all against migrating things that are closed if someone wants to do that work, and perhaps even copying over existing open issues periodically as they become closed (and accelerating the close rate by aggressive closing of silent issues). No new issues in Jira sounds fine, even better if enforced by Jira. Proceed from here in Github since that's where the community wants to go. Links to the migrated version automatically added to Jira and/or backlinks to Jira would be just fine too since readers might (hopefully needlessly) worry that something didn't get migrated, we should make it easy to check.
>
> What I don't want is for someone to land on an issue via link or via google search (or via search in jira because they are using Jira already for some other apache project), read through it and think A) it never got resolved when it did or B) miss the fact that it got reopened and further changes were made and only have half the story... or any other scenario where they are looking at an incomplete record of the issue. (thus obfuscating/splitting the very important rich history across systems).
>
> So that's why I feel issues should be completely tracked in the system where they were created. Syncing old closed stuff into a new system probably is fine so long as there are periodic sweeps to pull in reopens or newly completed issues. We could even sync open things so long as they are clearly marked in the title as having their primary record in Jira and "last synced from JIRA on YYYY-MM-DD" or something in a final comment each time new content is brought over.
>
> For simplicity and workload however maybe just sync things when they close. Depends on how much effort the person writing code for syncing things wants to put into it I guess.
>
> Although I agree with Dawid on the "What if Elon buys it?" issue, that ship has sailed, the community accepts that risk and we probably should not rehash it.
>
> WRT Robert's comments on PRs being issues... this has already worried me because I've already seen a lot of discussion on PRs, and I've worried that this stuff has the potential to get lost or be hard to find. If there is one key positive of this move, it is that these discussions will become easier to find, since the search in GitHub can find them. I would say that a PR is not a substitute for a well-described issue report, but that's probably a separate discussion (which I would hope mirrors the policy on small edits like typos or adding comments/javadoc not needing an issue). I've also seen folks who like to clean up and remove old branches and PRs, which is problematic if that's where the important discussion is (possibly a 3rd can of worms there).
>
> -Gus
>
> On Fri, Jun 17, 2022 at 4:34 PM Robert Muir <rcmuir@gmail.com> wrote:
>>
>> On Fri, Jun 17, 2022 at 3:27 PM Dawid Weiss <dawid.weiss@gmail.com> wrote:
>> >
>> > I'd be more afraid of what happens to github issues in two years (or longer). Will it look the same? Will it be different? Will it be gone (and how do we get a backup of the issue history then)? Contrary to the apache-hosted Jira, github is very much an independent entity. If Elon Musk decides to buy and close it tomorrow... then what? :)
>> >
>>
>> We already have a ton of github "issues" (pull requests, since PRs are issues).
>> If you want to "back them up", it's easy: you can paginate through them
>> 100 at a time, e.g. run this command, incrementing 'page' until it
>> returns an empty list:
>>
>> curl -H "Accept: application/vnd.github.v3+json" \
>>   "https://api.github.com/repos/apache/lucene/issues?per_page=100&page=1&direction=asc&state=all" \
>>   > file1.json
>>
>> Yeah of course if you want to backup the comments and stuff, you'll
>> need to do more.
>> But it is already the case today, that a ton of this "history" is
>> already in github issues, as PRs. Most recent JIRAs are just useless
>> placeholders.
>> Also, the same risks apply to JIRA, except that there they are not
>> theoretical but real concerns, no? I thought Atlassian had deprecated
>> "onsite" JIRA to try to sucker you into their "Atlassian Cloud":
>> https://www.theregister.com/2020/10/19/atlassian_server_licenses/
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>>
>
>
> --
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
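
The pagination approach Robert describes in the quoted message above can be sketched in Python. This is a minimal sketch, not a complete backup tool: it uses the same endpoint and query parameters as the curl command, while the function names (`fetch_page`, `iter_github_issues`, `backup`) and the output path are illustrative assumptions. Comments on issues are not fetched, and unauthenticated requests are subject to GitHub's rate limits.

```python
import json
import urllib.request
from typing import Callable, Iterator

API_URL = "https://api.github.com/repos/apache/lucene/issues"


def fetch_page(page: int, per_page: int = 100) -> list:
    """Fetch one page of issues (pull requests included) from the GitHub REST API."""
    url = f"{API_URL}?per_page={per_page}&page={page}&direction=asc&state=all"
    req = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github.v3+json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def iter_github_issues(fetch: Callable[[int], list] = fetch_page) -> Iterator[dict]:
    """Increment 'page' until the API returns an empty list."""
    page = 1
    while True:
        batch = fetch(page)
        if not batch:
            return
        yield from batch
        page += 1


def backup(path: str, fetch: Callable[[int], list] = fetch_page) -> int:
    """Write all issues to a single JSON file and return how many were saved."""
    issues = list(iter_github_issues(fetch))
    with open(path, "w", encoding="utf-8") as f:
        json.dump(issues, f)
    return len(issues)
```

Calling `backup("issues.json")` would page through the live API; passing a different `fetch` function makes the loop easy to try out offline.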

Re: [RESULT] [VOTE] Migration to GitHub issue from Jira [ In reply to ]

tomoko.uchida.1111 at gmail

Jun 17, 2022, 6:11 PM

Post #27 of 40 (972 views)


I feel like we should delay the decision on the migration of existing
issues until we have a clear picture of what can be done and what cannot
be done.

I'll write a migration script that preserves the issue history as
far as possible - then come back here with some experience.
Let's make a decision based on concrete knowledge and information.

Tomoko
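
One way such a script could keep metadata that GitHub cannot store natively (original reporter, assignee, creation time) is to write it into the body of the migrated issue. The sketch below is only an illustration under assumptions, not the script discussed in this thread: the input field names follow the JSON returned by the Jira REST API for an issue (`key`, `fields.summary`, `fields.description`, `fields.reporter.displayName`, `fields.assignee`, `fields.created`), while the function name, the title format, and the header layout are hypothetical. Jira markup in the description is copied as-is and not converted.

```python
def jira_to_github_payload(jira_issue: dict) -> dict:
    """Build a GitHub 'create issue' payload (title, body) from Jira issue JSON.

    Metadata that has no GitHub equivalent is kept as plain text at the
    top of the body, together with a backlink to the original Jira issue.
    """
    key = jira_issue["key"]
    fields = jira_issue["fields"]
    # Reporter/assignee may be missing or null in the Jira response.
    reporter = (fields.get("reporter") or {}).get("displayName", "unknown")
    assignee = (fields.get("assignee") or {}).get("displayName", "unassigned")
    header = [
        f"Migrated from Jira: https://issues.apache.org/jira/browse/{key}",
        f"Reporter: {reporter}",
        f"Assignee: {assignee}",
        f"Created: {fields.get('created', 'unknown')}",
    ]
    description = fields.get("description") or ""
    return {
        "title": f"{fields['summary']} [{key}]",  # hypothetical title format
        "body": "\n".join(header) + "\n\n" + description,
    }
```

The GitHub issue itself would still show the migrating account as author and the migration time as creation date; the header only records the original values as text.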


Re: [RESULT] [VOTE] Migration to GitHub issue from Jira [ In reply to ]

tomoko.uchida.1111 at gmail

Jun 17, 2022, 6:41 PM

Post #28 of 40 (972 views)


Does anyone have information on API access keys to Jira (preferably,
read-only and limited to Lucene project)?
https://issues.apache.org/jira/browse/LUCENE-10622


Re: [RESULT] [VOTE] Migration to GitHub issue from Jira [ In reply to ]

tomoko.uchida.1111 at gmail

Jun 17, 2022, 8:32 PM

Post #29 of 40 (972 views)


Replying to myself - Jira issues can be read via REST API without any
access token and we can iterate all issues by issue number.
curl -s https://issues.apache.org/jira/rest/api/latest/issue/LUCENE-10557

Would you please hold the discussion for a while - to me, it's a waste of
our time without a working prototype. I will be back here with a
sandbox GitHub repo where some of the existing Jira issues are migrated
(on a best-effort basis).
In the process, we could also figure out how to handle
GitHub metadata (milestones/labels).

Tomoko
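
Iterating over Jira issues by issue number, as described above, might look like the following in Python. This is a rough sketch under assumptions: the URL pattern is the one from the curl command, but the function names and the handling of gaps in the numbering (treating an HTTP 404 as "no such issue" and skipping it) are guesses, not confirmed behavior of the Apache Jira instance.

```python
import json
import urllib.error
import urllib.request
from typing import Callable, Iterator, Optional

BASE = "https://issues.apache.org/jira/rest/api/latest/issue"


def fetch_issue(key: str) -> Optional[dict]:
    """Read one issue via the REST API without an access token.

    Returns None if the server answers 404 (assumed to mean the
    issue number does not exist, e.g. a moved or deleted issue).
    """
    try:
        with urllib.request.urlopen(f"{BASE}/{key}") as resp:
            return json.load(resp)
    except urllib.error.HTTPError as e:
        if e.code == 404:
            return None
        raise


def iter_jira_issues(
    first: int,
    last: int,
    fetch: Callable[[str], Optional[dict]] = fetch_issue,
    project: str = "LUCENE",
) -> Iterator[dict]:
    """Yield issues PROJECT-first .. PROJECT-last, skipping missing numbers."""
    for n in range(first, last + 1):
        issue = fetch(f"{project}-{n}")
        if issue is not None:
            yield issue
```

For example, `iter_jira_issues(10550, 10557)` would request LUCENE-10550 through LUCENE-10557 one at a time; a real migration run would likely want to add a delay between requests.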


Re: [RESULT] [VOTE] Migration to GitHub issue from Jira [ In reply to ]

dawid.weiss at gmail

Jun 17, 2022, 11:43 PM

Post #30 of 40 (972 views)


Hi Tomoko,

I've added a few bullet points that the script could/should handle under
LUCENE-10557, hope you don't mind. If you place these script(s) in the open
then perhaps indeed we could try to collaborate and see what can be done.

Dawid

On Sat, Jun 18, 2022 at 5:33 AM Tomoko Uchida <tomoko.uchida.1111@gmail.com>
wrote:

> Replying to myself - Jira issues can be read via REST API without any
> access token and we can iterate all issues by issue number.
> curl -s https://issues.apache.org/jira/rest/api/latest/issue/LUCENE-10557
>
> Would you please hold the discussion for a while - it's a waste of our
> time without a working prototype to me. I will be back here with a
> sandbox github repo where part of existing jira issues are migrated
> (with the best effort).
> In the process, we could simultaneously figure out the way to operate
> GitHub metadata (milestones/labels).
>
> Tomoko
>
> 2022?6?18?(?) 10:41 Tomoko Uchida <tomoko.uchida.1111@gmail.com>:
>
> >
> > Does anyone have information on API access keys to Jira (preferably,
> > read-only and limited to Lucene project)?
> > https://issues.apache.org/jira/browse/LUCENE-10622
> >
> > 2022?6?18?(?) 10:11 Tomoko Uchida <tomoko.uchida.1111@gmail.com>:
> > >
> > > I feel like we should delay the decision on the mingration of existing
> > > issues until we have a clear image of what can be done and what cannot
> > > be done.
> > >
> > > I'll write some migration script that preserves the issue history as
> > > far as possible - then come back here with some experience.
> > > Let's make a decision upon the concrete knowledge and information.
> > >
> > > Tomoko
> > >
> > > 2022?6?18?(?) 9:26 Tomoko Uchida <tomoko.uchida.1111@gmail.com>:
> > > >
> > > > I don't intend to neglect histories in Jira... it's an important,
> > > > valuable asset for all of us and possible contributors in the future.
> > > >
> > > > It's important, *therefore*, I don't want to have the degraded copies
> > > > of them on GitHub.
> > > > We cannot preserve all of history - again, there should be tons of
> > > > unignorable information losses (timestamp, reporter, assignee,
> > > > markdown, metadata that cannot be ported to GitHub) if we attempt to
> > > > migrate the whole Jira history into Github. Rather than trying to
> have
> > > > such incomplete copies, I would preserve Jira issues in the perfectly
> > > > archived status, then simply refer to them.
> > > >
> > > > Tomoko

Re: [RESULT] [VOTE] Migration to GitHub issue from Jira [ In reply to ]

tomoko.uchida.1111 at gmail

Jun 17, 2022, 11:58 PM

Post #31 of 40 (972 views)

I'll give it a try, though I'm really skeptical that it can be done
with a satisfactory level of quality (we want to "preserve" issue
history, not just have shallow/degraded copies, right?), and the
migration will be significantly delayed while we figure out how to
properly move all issues to GitHub.
If there is another way to bypass this challenge, please let me know.

Tomoko

Sat, Jun 18, 2022, 15:44 Dawid Weiss <dawid.weiss@gmail.com>:

>
>
> Hi Tomoko,
>
> I've added a few bullet points that the script could/should handle under LUCENE-10557, hope you don't mind. If you place these script(s) in the open then perhaps we could indeed try to collaborate and see what can be done.
>
> Dawid
>
> On Sat, Jun 18, 2022 at 5:33 AM Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:
>>
>> Replying to myself - Jira issues can be read via REST API without any
>> access token and we can iterate all issues by issue number.
>> curl -s https://issues.apache.org/jira/rest/api/latest/issue/LUCENE-10557
>>
>> Could we please pause the discussion for a while? To me, continuing
>> without a working prototype is a waste of our time. I will come back
>> here with a sandbox GitHub repo where some of the existing Jira issues
>> are migrated (on a best-effort basis).
>> In the process, we can also figure out how to handle
>> GitHub metadata (milestones/labels).
>>
>> Tomoko
>>
>> Sat, Jun 18, 2022, 10:41 Tomoko Uchida <tomoko.uchida.1111@gmail.com>:
>>
>> >
>> > Does anyone have information on API access keys to Jira (preferably,
>> > read-only and limited to Lucene project)?
>> > https://issues.apache.org/jira/browse/LUCENE-10622
>> >
>> > Sat, Jun 18, 2022, 10:11 Tomoko Uchida <tomoko.uchida.1111@gmail.com>:
>> > >
>> > > I feel like we should delay the decision on the migration of existing
>> > > issues until we have a clear picture of what can be done and what cannot
>> > > be done.
>> > >
>> > > I'll write a migration script that preserves the issue history as
>> > > far as possible, then come back here with some experience.
>> > > Let's make a decision based on concrete knowledge and information.
>> > >
>> > > Tomoko

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

Re: [RESULT] [VOTE] Migration to GitHub issue from Jira [ In reply to ]

dawid.weiss at gmail

Jun 18, 2022, 12:18 AM

Post #32 of 40 (972 views)

I honestly don't know what can be done and what has to be sacrificed. I'm
pretty sure it'll be more difficult than the svn->git conversion because more
factors are involved. One tough thing to preserve may be user names
(reporters, etc.). I'm not sure how other projects dealt with that.

Perhaps a way to do it incrementally would be to create a json/xml
(structured) dump of the Jira content and then write a converter into a similar
json/xml dump for importing into GitHub. I remember it took many
iterations and a lot of trial and error for the svn->git conversion to eventually
reach its final shape, and it was simpler and faster to do it locally.
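
The dump-then-convert idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the input is assumed to be a local JSON file holding a list of issues shaped like Jira REST API responses (`key`, `fields.summary`, `fields.description`, `fields.reporter.displayName`, `fields.created`), and the output format (`title`, `body`, `labels`) and the `legacy-jira` label are hypothetical choices for an intermediate dump, not an agreed import format.

```python
import json


def convert_issue(jira_issue: dict) -> dict:
    """Convert one Jira issue dict into a GitHub-style issue dict."""
    fields = jira_issue.get("fields", {})
    key = jira_issue["key"]
    # Reporter may be missing or null; fall back to a placeholder.
    reporter = (fields.get("reporter") or {}).get("displayName", "unknown")
    created = fields.get("created", "unknown")
    # Keep metadata that has no GitHub equivalent as plain text in the body.
    header = (
        f"Migrated from https://issues.apache.org/jira/browse/{key}\n"
        f"Reporter: {reporter}\n"
        f"Created: {created}\n"
    )
    description = fields.get("description") or ""
    return {
        "title": f"{fields.get('summary', '')} [{key}]",
        "body": header + "\n" + description,
        "labels": ["legacy-jira"],  # hypothetical label name
    }


def convert_dump(src_path: str, dst_path: str) -> int:
    """Read a local Jira dump, write the converted dump, return the issue count."""
    with open(src_path, encoding="utf-8") as src:
        issues = json.load(src)
    converted = [convert_issue(issue) for issue in issues]
    with open(dst_path, "w", encoding="utf-8") as dst:
        json.dump(converted, dst, indent=2, ensure_ascii=False)
    return len(converted)
```

Because both sides are plain files, the converter can be rerun locally as many times as needed before anything is pushed to GitHub.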

Dawid

On Sat, Jun 18, 2022 at 8:59 AM Tomoko Uchida <tomoko.uchida.1111@gmail.com>
wrote:

> I'll give it a try, though I'm really skeptical that it can be done
> with a satisfactory level of quality (we want to "preserve" issue
> history, not just have shallow/degraded copies, right?), and the
> migration will be significantly delayed while we figure out how to
> properly move all issues to GitHub.
> If there is another way to bypass this challenge, please let me know.
>
> Tomoko

Re: [RESULT] [VOTE] Migration to GitHub issue from Jira [ In reply to ]

rcmuir at gmail

Jun 18, 2022, 2:52 AM

Post #33 of 40 (972 views)

I looked at some related projects on github:
https://github.com/Skraeda/jira-2-github
It covers only the bare-bones basics, but it helps you think of the inputs:
"username mapping", "release -> milestone mapping", etc. Of course, for a
username mapping, maybe it's best to just handle the top 99% or so and
let the long tail come across as "full name". I also found plenty
of projects that convert "special jira language" to markdown, e.g.
https://github.com/catcombo/jira2markdown
I'm not convinced the conversion would be degraded. With a little
thought put into it, I think it could actually be *better*.
GitHub issues can do everything Jira can, just without the fussy UI.
E.g. issues can have attachments (for all the patch files), and
attachment names can have duplicates. Issues can link to other issues,
commits, or PRs easily.
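
To make the two inputs mentioned above concrete, here is a small hand-rolled sketch (it does not use the jira2markdown library). It assumes a hypothetical `USER_MAP` from Jira usernames to GitHub logins with a fallback to the full name for the long tail, and it converts only a handful of Jira wiki constructs (headings, `{{monospace}}`, `[text|url]` links, `{code}`/`{noformat}` blocks). A real converter would need many more rules, and this one does not protect the contents of code blocks from the other substitutions.

```python
import re

# Hypothetical mapping from Jira usernames to GitHub logins.
USER_MAP = {"jdoe": "jdoe-gh"}


def render_user(jira_name: str, full_name: str) -> str:
    """Return an @mention for mapped users, else fall back to the full name."""
    login = USER_MAP.get(jira_name)
    return f"@{login}" if login else full_name


def jira_to_markdown(text: str) -> str:
    """Convert a small subset of Jira wiki markup to Markdown."""
    # {code}, {code:java}, {noformat} -> fence markers (language hint is dropped)
    text = re.sub(r"\{(?:code|noformat)(?::[^}]*)?\}", "```", text)
    # h1. .. h6. at the start of a line -> #, ##, ...
    text = re.sub(
        r"^h([1-6])\.[ \t]+",
        lambda m: "#" * int(m.group(1)) + " ",
        text,
        flags=re.MULTILINE,
    )
    # {{monospace}} -> `monospace`
    text = re.sub(r"\{\{(.+?)\}\}", r"`\1`", text)
    # [text|url] -> [text](url)
    text = re.sub(r"\[([^|\]\n]+)\|([^\]\n]+)\]", r"[\1](\2)", text)
    return text
```

Handling only the most active users in the map and letting everyone else appear by full name keeps the mapping table small.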

It just depends on how much we want to invest in it. If we want to
really go whole-hog, then when we do the initial JIRA->issue
conversion, we should *save that mapping* as a .CSV file or similar,
because later we could use it to find/replace URLs in
Changes.txt, source code, benchmark annotations, etc. Let's at
least leave the possibility open to do that work as a followup.
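
A rough sketch of what saving and reusing that mapping might look like. The CSV column names (`jira_key`, `github_issue`), the helper names, and the issue numbers in the usage note are made up for illustration; only the Jira browse URL pattern and the apache/lucene repository name come from this thread.

```python
import csv
import io
import re

# Matches Jira browse URLs for the LUCENE project and captures the issue key.
JIRA_URL = re.compile(r"https://issues\.apache\.org/jira/browse/(LUCENE-\d+)")


def mapping_to_csv(mapping: dict) -> str:
    """Serialize {jira_key: github_issue_number} as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["jira_key", "github_issue"])
    for key, number in sorted(mapping.items()):
        writer.writerow([key, number])
    return buf.getvalue()


def mapping_from_csv(csv_text: str) -> dict:
    """Parse CSV text produced by mapping_to_csv back into a dict."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["jira_key"]: int(row["github_issue"]) for row in reader}


def rewrite_urls(text: str, mapping: dict) -> str:
    """Replace Jira URLs with GitHub issue URLs where a mapping exists."""
    def repl(match: re.Match) -> str:
        number = mapping.get(match.group(1))
        if number is None:
            return match.group(0)  # leave unmapped issues untouched
        return f"https://github.com/apache/lucene/issues/{number}"
    return JIRA_URL.sub(repl, text)
```

For example, with a mapping of `{"LUCENE-10557": 42}` (a made-up number), `rewrite_urls` would turn a Jira link to LUCENE-10557 into a link to GitHub issue 42 and leave links to unmapped issues as they are.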

I find the idea that we're stuck looking at JIRA forever ridiculous.

On Sat, Jun 18, 2022 at 3:19 AM Dawid Weiss <dawid.weiss@gmail.com> wrote:
>
>
> I honestly don't know what can be done and what has to be sacrificed. I'm pretty sure it'll be more difficult than svn->git conversion because more factors are involved. One tough thing to somehow preserve may be user names (reporters, etc.). I'm not sure how other projects dealt with that.
>
> Perhaps a way to do it incrementally would be to create a json/xml (structured) dump of jira content and then write a converter into a similar json/xml dump for importing into github. I remember it took many iterations and trial and error for svn->git conversion to eventually reach the final shape and it was simpler and faster to do it locally.
>
> Dawid
>
> On Sat, Jun 18, 2022 at 8:59 AM Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:
>>
>> I'll give it a try though, I'm really skeptical that it can be done
>> with a satisfactory level of quality (we want to "preserve" issue
>> history, not just to have shallow/degraded copies, right?), and the
>> migration will be significantly delayed to figure out the way to
>> properly moving all issues to GitHub.
>> if there is another way to bypass this challenge - please let me know.
>>
>> Tomoko
>>
>> 2022?6?18?(?) 15:44 Dawid Weiss <dawid.weiss@gmail.com>:
>>
>> >
>> >
>> > Hi Tomoko,
>> >
>> > I've added a few bullet points that script could/should handle under LUCENE-10557, hope you don't mind. If you place these script(s) in the open then perhaps indeed we could try to collaborate and see what can be done.
>> >
>> > Dawid
>> >
>> > On Sat, Jun 18, 2022 at 5:33 AM Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:
>> >>
>> >> Replying to myself - Jira issues can be read via REST API without any
>> >> access token and we can iterate all issues by issue number.
>> >> curl -s https://issues.apache.org/jira/rest/api/latest/issue/LUCENE-10557
>> >>
>> >> Would you please hold the discussion for a while - it's a waste of our
>> >> time without a working prototype to me. I will be back here with a
>> >> sandbox github repo where part of existing jira issues are migrated
>> >> (with the best effort).
>> >> In the process, we could simultaneously figure out the way to operate
>> >> GitHub metadata (milestones/labels).
>> >>
>> >> Tomoko
>> >>
>> >> 2022?6?18?(?) 10:41 Tomoko Uchida <tomoko.uchida.1111@gmail.com>:
>> >>
>> >> >
>> >> > Does anyone have information on API access keys to Jira (preferably,
>> >> > read-only and limited to Lucene project)?
>> >> > https://issues.apache.org/jira/browse/LUCENE-10622
>> >> >
>> >> > 2022?6?18?(?) 10:11 Tomoko Uchida <tomoko.uchida.1111@gmail.com>:
>> >> > >
>> >> > > I feel like we should delay the decision on the mingration of existing
>> >> > > issues until we have a clear image of what can be done and what cannot
>> >> > > be done.
>> >> > >
>> >> > > I'll write some migration script that preserves the issue history as
>> >> > > far as possible - then come back here with some experience.
>> >> > > Let's make a decision upon the concrete knowledge and information.
>> >> > >
>> >> > > Tomoko
>> >> > >
>> >> > > 2022?6?18?(?) 9:26 Tomoko Uchida <tomoko.uchida.1111@gmail.com>:
>> >> > > >
>> >> > > > I don't intend to neglect histories in Jira... it's an important,
>> >> > > > valuable asset for all of us and possible contributors in the future.
>> >> > > >
>> >> > > > It's important, *therefore*, I don't want to have the degraded copies
>> >> > > > of them on GitHub.
>> >> > > > We cannot preserve all of history - again, there should be tons of
>> >> > > > unignorable information losses (timestamp, reporter, assignee,
>> >> > > > markdown, metadata that cannot be ported to GitHub) if we attempt to
>> >> > > > migrate the whole Jira history into Github. Rather than trying to have
>> >> > > > such incomplete copies, I would preserve Jira issues in the perfectly
>> >> > > > archived status, then simply refer to them.
>> >> > > >
>> >> > > > Tomoko
>> >> > > >
>> >> > > > 2022?6?18?(?) 7:47 Gus Heck <gus.heck@gmail.com>:
>> >> > > > >
>> >> > > > > I hope you count me as someone who sees history as important. It's important in more ways than one however. You gave the example of trying to understand something, and looking at the issue history directly. I also give weight to the scenario where someone has written a blog post about the topic and linked the issue "For the latest see LUCENE-XXXX" for example... Or someone planning upgrades has a spreadsheet of things to track down... The existing links should point to a *complete* history of the issue.
>> >> > > > >
>> >> > > > > I don't see the migration of everything to github as being as critical as you do but I'm not at all against migrating things that are closed if someone wants to do that work, and perhaps even copying over existing open issues periodically as they become closed (and accelerating the close rate by aggressive closing of silent issues). No new issues in Jira sounds fine, even better if enforced by Jira. Proceed from here in Github since that's where the community wants to go. Links to the migrated version automatically added to Jira and/or backlinks to Jira would be just fine too since readers might (hopefully needlessly) worry that something didn't get migrated, we should make it easy to check.
>> >> > > > >
>> >> > > > > What I don't want is for someone to land on an issue via link or via google search (or via search in jira because they are using Jira already for some other apache project), read through it and think A) it never got resolved when it did or B) miss the fact that it got reopened and further changes were made and only have half the story... or any other scenario where they are looking at an incomplete record of the issue. (thus obfuscating/splitting the very important rich history across systems).
>> >> > > > >
>> >> > > > > So that's why I feel issues should be completely tracked in the system where they were created. Syncing old closed stuff into a new system probably is fine so long as there are periodic sweeps to pull in reopens or newly completed issues. We could even sync open things so long as they are clearly marked in the title as having their primary record in Jira and "last synced from JIRA on YYYY-MM-DD" or something in a final comment each time new content is brought over.
>> >> > > > >
>> >> > > > > For simplicity and workload however maybe just sync things when they close. Depends on how much effort the person writing code for syncing things wants to put into it I guess.
>> >> > > > >
>> >> > > > > Although I agree with Dawid on the "What if Elon buys it?" issue, that ship has sailed, the community accepts that risk and we probably should not rehash it.
>> >> > > > >
>> >> > > > > WRT Robert's comments on PRs being issues... this has already worried me because I've already seen a lot of discussion on PR's and I've worried that this stuff has the potential to get lost or be hard to find. If there is one key positive of this move is that they will become easier to find since the search in github can find it. I would say that a PR is not a substitute for a well described issue report but that's probably a separate discussion (which I would hope mirrors the policy on small edits like typos or adding comments/javadoc not needing an issue). I've also seen folks who like to clean up and remove old branches and PR's, which is problematic if that's where the important discussion is (possibly a 3rd can of worms there).
>> >> > > > >
>> >> > > > > -Gus
>> >> > > > >
>> >> > > > > On Fri, Jun 17, 2022 at 4:34 PM Robert Muir <rcmuir@gmail.com> wrote:
>> >> > > > >>
>> >> > > > >> On Fri, Jun 17, 2022 at 3:27 PM Dawid Weiss <dawid.weiss@gmail.com> wrote:
>> >> > > > >> >
>> >> > > > > >> > I'd be more afraid of what happens to github issues in two years (or longer). Will it look the same? Will it be different? Will it be gone (and how do we get a backup of the issue history then)? Contrary to the apache-hosted Jira, github is very much an independent entity. If Elon Musk decides to buy and close it tomorrow... then what? :)
>> >> > > > >> >
>> >> > > > >>
>> >> > > > >> We already have a ton of github "issues" (pull requests, since PRs are issues).
>> >> > > > > >> If you want to "back them up", it's easy: you can paginate through them
>> >> > > > > >> 100 at a time, e.g. run this command, incrementing 'page' until it
>> >> > > > > >> returns an empty list:
>> >> > > > > >>
>> >> > > > > >> curl -H "Accept: application/vnd.github.v3+json" \
>> >> > > > > >>   "https://api.github.com/repos/apache/lucene/issues?per_page=100&page=1&direction=asc&state=all" \
>> >> > > > > >>   > file1.json
>> >> > > > >>
>> >> > > > >> Yeah of course if you want to backup the comments and stuff, you'll
>> >> > > > >> need to do more.
>> >> > > > >> But it is already the case today, that a ton of this "history" is
>> >> > > > >> already in github issues, as PRs. Most recent JIRAs are just useless
>> >> > > > >> placeholders.
>> >> > > > > >> Also, the same risks apply to JIRA, except there they are real
>> >> > > > > >> concerns rather than theoretical ones, no? I thought Atlassian had deprecated "onsite" JIRA to try
>> >> > > > > >> to sucker you into their "Atlassian Cloud":
>> >> > > > >> https://www.theregister.com/2020/10/19/atlassian_server_licenses/
>> >> > > > >>
>> >> > > > >> ---------------------------------------------------------------------
>> >> > > > >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> >> > > > >> For additional commands, e-mail: dev-help@lucene.apache.org
>> >> > > > >>
>> >> > > > >
>> >> > > > >
>> >> > > > > --
>> >> > > > > http://www.needhamsoftware.com (work)
>> >> > > > > http://www.the111shift.com (play)

Re: [RESULT] [VOTE] Migration to GitHub issue from Jira [ In reply to ]

tomoko.uchida.1111 at gmail

Jun 18, 2022, 4:41 AM

Post #34 of 40 (972 views)

User id mapping is an important consideration for me.

Can we find a mapping from Jira user id to GitHub account anywhere?
Don't we have to gain the consent of each individual to map both accounts?

2022?6?18?(?) 18:52 Robert Muir <rcmuir@gmail.com>:
>
> I looked at some related projects on github:
> https://github.com/Skraeda/jira-2-github
> Does the barebones basics but helps you think of the inputs: "username
> mapping", "release -> milestone mapping", etc. Of course for a
> username mapping, maybe it's best to just handle the top 99% or so and
> let the long-tail just come across as "full name". I also find plenty
> of projects that convert "special jira language" to markdown, e.g.
> https://github.com/catcombo/jira2markdown
> I'm not convinced conversion would be degraded, with a little bit of
> thought into the conversion, I think it could actually be *better*.
> github issues can do everything jira can, just without the fussy UI.
> e.g. issues can have attachments (for all the patch files), and
> attachment names can have duplicates. Issues can link to other issues,
> commits, or PRs easily.
>
> It just depends on how much we want to invest into it. If we want to
> really go whole-hog, then when we do the initial JIRA->issue
> conversion, we should *save that mapping* as a .CSV file or similar.
> Because later we could then use it to find/replace URLs in
> Changes.txt, source code, benchmark annotations, etc etc. Let's at
> least leave the possibility open to do that work as followup.
>
> I find the idea that we're stuck looking at JIRA forever ridiculous.
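
A rough sketch of the follow-up described in the quoted message, assuming the migration script records each Jira key next to the GitHub issue number it created. The CSV column names, the sample issue numbers (101, 102), and the apache/lucene issue URL pattern are placeholders for illustration, not an agreed format.

```python
import csv
import io
import re

# Hypothetical mapping saved by the migration script: Jira key -> GitHub issue number.
# The issue numbers below are made up for the example.
MAPPING_CSV = "jira_key,github_issue\nLUCENE-10557,101\nLUCENE-10622,102\n"

# Assumed URL pattern for the migrated issues.
GITHUB_ISSUE_URL = "https://github.com/apache/lucene/issues/{number}"

JIRA_URL_RE = re.compile(r"https://issues\.apache\.org/jira/browse/(LUCENE-\d+)")


def load_mapping(text):
    """Parse the CSV text into a {jira_key: issue_number} dict."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["jira_key"]: int(row["github_issue"]) for row in reader}


def rewrite_jira_urls(text, mapping):
    """Replace Jira browse URLs with GitHub issue URLs; unmapped keys stay untouched."""
    def replace(match):
        key = match.group(1)
        if key not in mapping:
            return match.group(0)
        return GITHUB_ISSUE_URL.format(number=mapping[key])

    return JIRA_URL_RE.sub(replace, text)
```

Leaving unmapped keys untouched means the same pass can be re-run later, as more issues are migrated, without damaging links that still have to point at Jira.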

Re: [RESULT] [VOTE] Migration to GitHub issue from Jira [ In reply to ]

rcmuir at gmail

Jun 18, 2022, 12:48 PM

Post #35 of 40 (972 views)

On Sat, Jun 18, 2022, 7:42 AM Tomoko Uchida <tomoko.uchida.1111@gmail.com>
wrote:

> User id mapping is an important consideration for me.
>
> Can we find a mapping from Jira user id to GitHub account anywhere?
>

I think we would have to create it. But my hope would be that maybe 50-100
names would cover the large majority of issues.
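
One way to check that guess before building the table by hand is to count how many issues the most frequent reporters account for. This is only a sketch: the function name and the sample data are made up, and the input is assumed to be one Jira username per issue.

```python
from collections import Counter


def coverage_of_top_users(reporters, top_n):
    """Return the fraction of issues whose reporter is among the top_n most frequent."""
    counts = Counter(reporters)
    if not counts:
        return 0.0
    covered = sum(count for _, count in counts.most_common(top_n))
    return covered / len(reporters)


# Made-up sample: one Jira username per issue.
sample = ["alice"] * 6 + ["bob"] * 3 + ["carol"]
print(coverage_of_top_users(sample, 2))
```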

> Don't we have to gain the consent of each individual to map both accounts?
>

No, we don't have to ask permission to mention someone with an @username.
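
As a sketch of how such a mapping might be applied, the helper below emits an @mention when a Jira user is in the table and falls back to the plain full name otherwise. The table contents and the function name are hypothetical.

```python
# Hypothetical hand-built table: Jira username -> GitHub login.
ACCOUNT_MAP = {"jira-alice": "alice-gh"}


def render_user(jira_username, full_name, account_map):
    """Return an @mention when the account is mapped, else the plain full name."""
    login = account_map.get(jira_username)
    return f"@{login}" if login else full_name
```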

Re: [RESULT] [VOTE] Migration to GitHub issue from Jira [ In reply to ]

jan.asf at cominvent

Jun 18, 2022, 3:18 PM

Post #36 of 40 (971 views)

You'll not be able to create a GH issue with a script, on behalf of the original reporter.
Neither will you be able to add a GH issue comment impersonating the person who made the JIRA comment.
Rather, all issues and all comments would be made by the script/bot user, and you'd have to add in free-text information about reporter and commenter.
You may be able to assign an issue to the github-id of the original JIRA assignee though.
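
A minimal sketch of what that free-text attribution could look like. `title`, `body` and `assignees` are fields of GitHub's REST request for creating an issue; the flattened `jira_issue` record, its key names, and the wording of the header line are assumptions for illustration only.

```python
def build_issue_payload(jira_issue, account_map):
    """Build a create-issue request body from a simplified Jira record.

    The issue itself would be created by the script/bot user, so the original
    reporter is only recorded as text at the top of the body.
    """
    header = (
        f"Migrated from Jira {jira_issue['key']}. "
        f"Reported by {jira_issue['reporter_name']} on {jira_issue['created']}."
    )
    payload = {
        "title": f"{jira_issue['summary']} [{jira_issue['key']}]",
        "body": header + "\n\n" + jira_issue["description"],
    }
    # Only set an assignee when the Jira assignee has a known GitHub login.
    assignee = account_map.get(jira_issue.get("assignee"))
    if assignee:
        payload["assignees"] = [assignee]
    return payload
```

Comments would need the same treatment: each one posted by the bot, with the original commenter and timestamp written into the comment text.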

See https://stackoverflow.com/questions/36540508/github-api-possible-to-set-reporter-of-issue-to-another-user-when-creating-issu

I'm still against duplicating jira history over to GitHub. I think it will be a confusing mess and not worth the effort. The history split can be mitigated by documentation, and perhaps a search engine :)

If, however, experiments show that a quality migration is possible, then can we please open a separate git repo only for the historic issues? It is easy to search across two repos in GitHub, e.g. https://github.com/pulls?q=is%3Apr+is%3Aclosed+repo%3Aapache%2Flucene+repo%3Aapache%2Flucene-solr+wizard+
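
For illustration, a search link of that shape can be generated for any set of repos. This is a small sketch using the two repositories from the example link; the function name and its defaults are made up.

```python
from urllib.parse import quote_plus


def cross_repo_search_url(repos, terms, qualifiers=("is:pr", "is:closed")):
    """Build a github.com search URL restricted to the given repositories."""
    parts = list(qualifiers) + [f"repo:{repo}" for repo in repos] + list(terms)
    return "https://github.com/pulls?q=" + quote_plus(" ".join(parts))


print(cross_repo_search_url(["apache/lucene", "apache/lucene-solr"], ["wizard"]))
```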

Jan

> On 18 Jun 2022, at 21:48, Robert Muir <rcmuir@gmail.com> wrote:
>
>
>
> On Sat, Jun 18, 2022, 7:42 AM Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:
> User id mapping is an important consideration for me.
>
> Can we find a mapping from Jira user id to GitHub account anywhere?
>
> I think we would have to create it. But my hope would be that maybe 50-100 names would cover large majority of issues.
> Don't we have to gain the consent of each individual to map both accounts?
>
> No, we don't have to ask permission to mention someone with an @username
>
>
> On Sat, Jun 18, 2022 at 18:52, Robert Muir <rcmuir@gmail.com> wrote:
> >
> > I looked at some related projects on github:
> > https://github.com/Skraeda/jira-2-github
> > Does the barebones basics but helps you think of the inputs: "username
> > mapping", "release -> milestone mapping", etc. Of course for a
> > username mapping, maybe it's best to just handle the top 99% or so and
> > let the long-tail just come across as "full name". I also find plenty
> > of projects that convert "special jira language" to markdown, e.g.
> > https://github.com/catcombo/jira2markdown
> > I'm not convinced conversion would be degraded, with a little bit of
> > thought into the conversion, I think it could actually be *better*.
> > github issues can do everything jira can, just without the fussy UI.
> > e.g. issues can have attachments (for all the patch files), and
> > attachment names can have duplicates. Issues can link to other issues,
> > commits, or PRs easily.
> >
> > It just depends on how much we want to invest into it. If we want to
> > really go whole-hog, then when we do the initial JIRA->issue
> > conversion, we should *save that mapping* as a .CSV file or similar.
> > Because later we could then use it to find/replace URLs in
> > Changes.txt, source code, benchmark annotations, etc etc. Let's at
> > least leave the possibility open to do that work as followup.
> >
> > I find the idea that we're stuck looking at JIRA forever ridiculous.
> >
> > On Sat, Jun 18, 2022 at 3:19 AM Dawid Weiss <dawid.weiss@gmail.com <mailto:dawid.weiss@gmail.com>> wrote:
> > >
> > >
> > > I honestly don't know what can be done and what has to be sacrificed. I'm pretty sure it'll be more difficult than svn->git conversion because more factors are involved. One tough thing to somehow preserve may be user names (reporters, etc.). I'm not sure how other projects dealt with that.
> > >
> > > Perhaps a way to do it incrementally would be to create a json/xml (structured) dump of jira content and then write a converter into a similar json/xml dump for importing into github. I remember it took many iterations and trial and error for svn->git conversion to eventually reach the final shape and it was simpler and faster to do it locally.
> > >
> > > Dawid
> > >
> > > On Sat, Jun 18, 2022 at 8:59 AM Tomoko Uchida <tomoko.uchida.1111@gmail.com <mailto:tomoko.uchida.1111@gmail.com>> wrote:
> > >>
> > >> I'll give it a try though, I'm really skeptical that it can be done
> > >> with a satisfactory level of quality (we want to "preserve" issue
> > >> history, not just to have shallow/degraded copies, right?), and the
> > >> migration will be significantly delayed to figure out the way to
> > >> properly moving all issues to GitHub.
> > >> if there is another way to bypass this challenge - please let me know.
> > >>
> > >> Tomoko
> > >>
> > >> 2022?6?18?(?) 15:44 Dawid Weiss <dawid.weiss@gmail.com <mailto:dawid.weiss@gmail.com>>:
> > >>
> > >> >
> > >> >
> > >> > Hi Tomoko,
> > >> >
> > >> > I've added a few bullet points that script could/should handle under LUCENE-10557, hope you don't mind. If you place these script(s) in the open then perhaps indeed we could try to collaborate and see what can be done.
> > >> >
> > >> > Dawid
> > >> >
> > >> > On Sat, Jun 18, 2022 at 5:33 AM Tomoko Uchida <tomoko.uchida.1111@gmail.com <mailto:tomoko.uchida.1111@gmail.com>> wrote:
> > >> >>
> > >> >> Replying to myself - Jira issues can be read via REST API without any
> > >> >> access token and we can iterate all issues by issue number.
> > >> >> curl -s https://issues.apache.org/jira/rest/api/latest/issue/LUCENE-10557 <https://issues.apache.org/jira/rest/api/latest/issue/LUCENE-10557>
> > >> >>
> > >> >> Would you please hold the discussion for a while - it's a waste of our
> > >> >> time without a working prototype to me. I will be back here with a
> > >> >> sandbox github repo where part of existing jira issues are migrated
> > >> >> (with the best effort).
> > >> >> In the process, we could simultaneously figure out the way to operate
> > >> >> GitHub metadata (milestones/labels).
> > >> >>
> > >> >> Tomoko
> > >> >>
> > >> >> 2022?6?18?(?) 10:41 Tomoko Uchida <tomoko.uchida.1111@gmail.com <mailto:tomoko.uchida.1111@gmail.com>>:
> > >> >>
> > >> >> >
> > >> >> > Does anyone have information on API access keys to Jira (preferably,
> > >> >> > read-only and limited to Lucene project)?
> > >> >> > https://issues.apache.org/jira/browse/LUCENE-10622 <https://issues.apache.org/jira/browse/LUCENE-10622>
> > >> >> >
> > >> >> > 2022?6?18?(?) 10:11 Tomoko Uchida <tomoko.uchida.1111@gmail.com <mailto:tomoko.uchida.1111@gmail.com>>:
> > >> >> > >
> > >> >> > > I feel like we should delay the decision on the mingration of existing
> > >> >> > > issues until we have a clear image of what can be done and what cannot
> > >> >> > > be done.
> > >> >> > >
> > >> >> > > I'll write some migration script that preserves the issue history as
> > >> >> > > far as possible - then come back here with some experience.
> > >> >> > > Let's make a decision upon the concrete knowledge and information.
> > >> >> > >
> > >> >> > > Tomoko
> > >> >> > >
> > >> >> > > 2022?6?18?(?) 9:26 Tomoko Uchida <tomoko.uchida.1111@gmail.com <mailto:tomoko.uchida.1111@gmail.com>>:
> > >> >> > > >
> > >> >> > > > I don't intend to neglect histories in Jira... it's an important,
> > >> >> > > > valuable asset for all of us and possible contributors in the future.
> > >> >> > > >
> > >> >> > > > It's important, *therefore*, I don't want to have the degraded copies
> > >> >> > > > of them on GitHub.
> > >> >> > > > We cannot preserve all of history - again, there should be tons of
> > >> >> > > > unignorable information losses (timestamp, reporter, assignee,
> > >> >> > > > markdown, metadata that cannot be ported to GitHub) if we attempt to
> > >> >> > > > migrate the whole Jira history into Github. Rather than trying to have
> > >> >> > > > such incomplete copies, I would preserve Jira issues in the perfectly
> > >> >> > > > archived status, then simply refer to them.
> > >> >> > > >
> > >> >> > > > Tomoko
> > >> >> > > >
> > >> >> > > > 2022?6?18?(?) 7:47 Gus Heck <gus.heck@gmail.com <mailto:gus.heck@gmail.com>>:
> > >> >> > > > >
> > >> >> > > > > I hope you count me as someone who sees history as important. It's important in more ways than one however. You gave the example of trying to understand something, and looking at the issue history directly. I also give weight to the scenario where someone has written a blog post about the topic and linked the issue "For the latest see LUCENE-XXXX" for example... Or someone planning upgrades has a spreadsheet of things to track down... The existing links should point to a *complete* history of the issue.
> > >> >> > > > >
> > >> >> > > > > I don't see the migration of everything to github as being as critical as you do but I'm not at all against migrating things that are closed if someone wants to do that work, and perhaps even copying over existing open issues periodically as they become closed (and accelerating the close rate by aggressive closing of silent issues). No new issues in Jira sounds fine, even better if enforced by Jira. Proceed from here in Github since that's where the community wants to go. Links to the migrated version automatically added to Jira and/or backlinks to Jira would be just fine too since readers might (hopefully needlessly) worry that something didn't get migrated, we should make it easy to check.
> > >> >> > > > >
> > >> >> > > > > What I don't want is for someone to land on an issue via link or via google search (or via search in jira because they are using Jira already for some other apache project), read through it and think A) it never got resolved when it did or B) miss the fact that it got reopened and further changes were made and only have half the story... or any other scenario where they are looking at an incomplete record of the issue. (thus obfuscating/splitting the very important rich history across systems).
> > >> >> > > > >
> > >> >> > > > > So that's why I feel issues should be completely tracked in the system where they were created. Syncing old closed stuff into a new system probably is fine so long as there are periodic sweeps to pull in reopens or newly completed issues. We could even sync open things so long as they are clearly marked in the title as having their primary record in Jira and "last synced from JIRA on YYYY-MM-DD" or something in a final comment each time new content is brought over.
> > >> >> > > > >
> > >> >> > > > > For simplicity and workload however maybe just sync things when they close. Depends on how much effort the person writing code for syncing things wants to put into it I guess.
> > >> >> > > > >
> > >> >> > > > > Although I agree with Dawid on the "What if Elon buys it?" issue, that ship has sailed, the community accepts that risk and we probably should not rehash it.
> > >> >> > > > >
> > >> >> > > > > WRT Robert's comments on PRs being issues... this has already worried me because I've already seen a lot of discussion on PR's and I've worried that this stuff has the potential to get lost or be hard to find. If there is one key positive of this move is that they will become easier to find since the search in github can find it. I would say that a PR is not a substitute for a well described issue report but that's probably a separate discussion (which I would hope mirrors the policy on small edits like typos or adding comments/javadoc not needing an issue). I've also seen folks who like to clean up and remove old branches and PR's, which is problematic if that's where the important discussion is (possibly a 3rd can of worms there).
> > >> >> > > > >
> > >> >> > > > > -Gus
> > >> >> > > > >
> > >> >> > > > > On Fri, Jun 17, 2022 at 4:34 PM Robert Muir <rcmuir@gmail.com <mailto:rcmuir@gmail.com>> wrote:
> > >> >> > > > >>
> > >> >> > > > >> On Fri, Jun 17, 2022 at 3:27 PM Dawid Weiss <dawid.weiss@gmail.com <mailto:dawid.weiss@gmail.com>> wrote:
> > >> >> > > > >> >
> > >> >> > > > >> > I'd be more afraid of what happens to github issues in two years (or longer). Will it look the same? Will it be different? Will it be gone (and how do we get a backup of the isse history then)? Contrary to the apache-hosted Jira, github is very much an independent entity. If Elon Musk decides to buy and close it tomorrow... then what? :)
> > >> >> > > > >> >
> > >> >> > > > >>
> > >> >> > > > >> We already have a ton of github "issues" (pull requests, since PRs are issues).
> > >> >> > > > >> If you want to "back them up", its easy, you can paginate thru them
> > >> >> > > > >> 100 at a time, e.g. run this command, incrementing 'page' until it
> > >> >> > > > >> returns empty list:
> > >> >> > > > >>
> > >> >> > > > >> curl -H "Accept: application/vnd.github.v3+json"
> > >> >> > > > >> "https://api.github.com/repos/apache/lucene/issues?per_page=100&page=1&direction=asc&state=all <https://api.github.com/repos/apache/lucene/issues?per_page=100&page=1&direction=asc&state=all>"
> > >> >> > > > >> > file1.json
> > >> >> > > > >>
> > >> >> > > > >> Yeah of course if you want to backup the comments and stuff, you'll
> > >> >> > > > >> need to do more.
> > >> >> > > > >> But it is already the case today, that a ton of this "history" is
> > >> >> > > > >> already in github issues, as PRs. Most recent JIRAs are just useless
> > >> >> > > > >> placeholders.
> > >> >> > > > >> Also the same risks apply to JIRA, except are not theoretical and real
> > >> >> > > > >> concerns, no? I thought Atlassian had deprecated "onsite" JIRA to try
> > >> >> > > > >> to sucker you into their "Atlassian Cloud":
> > >> >> > > > >> https://www.theregister.com/2020/10/19/atlassian_server_licenses/ <https://www.theregister.com/2020/10/19/atlassian_server_licenses/>
> > >> >> > > > >>
> > >> >> > > > >> ---------------------------------------------------------------------
> > >> >> > > > >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org <mailto:dev-unsubscribe@lucene.apache.org>
> > >> >> > > > >> For additional commands, e-mail: dev-help@lucene.apache.org <mailto:dev-help@lucene.apache.org>
> > >> >> > > > >>
> > >> >> > > > >
> > >> >> > > > >
> > >> >> > > > > --
> > >> >> > > > > http://www.needhamsoftware.com <http://www.needhamsoftware.com/> (work)
> > >> >> > > > > http://www.the111shift.com <http://www.the111shift.com/> (play)
> > >> >>
> > >> >> ---------------------------------------------------------------------
> > >> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org <mailto:dev-unsubscribe@lucene.apache.org>
> > >> >> For additional commands, e-mail: dev-help@lucene.apache.org <mailto:dev-help@lucene.apache.org>
> > >> >>
> > >>
> > >> ---------------------------------------------------------------------
> > >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org <mailto:dev-unsubscribe@lucene.apache.org>
> > >> For additional commands, e-mail: dev-help@lucene.apache.org <mailto:dev-help@lucene.apache.org>
> > >>
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org <mailto:dev-unsubscribe@lucene.apache.org>
> > For additional commands, e-mail: dev-help@lucene.apache.org <mailto:dev-help@lucene.apache.org>
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org <mailto:dev-unsubscribe@lucene.apache.org>
> For additional commands, e-mail: dev-help@lucene.apache.org <mailto:dev-help@lucene.apache.org>
>

Re: [RESULT] [VOTE] Migration to GitHub issue from Jira [ In reply to ]

tomoko.uchida.1111 at gmail

Jun 18, 2022, 7:10 PM

Post #37 of 40 (970 views)

Thanks Robert and Jan, for your information.

> I'm still against duplicating jira history over to GitHub. I think it will be a confusing mess and not worth the effort.

I am also against having duplicated issue history in two issue systems.
I think it's my duty to provide information on what can and cannot be
done, so that we are not stuck in an argument without basis.

Would you all please hold the discussion on whether we "should" migrate
existing issues for a little while?
I'll try to give a working example - let's see how it looks, then make
the final decision.

Tomoko

On Sun, Jun 19, 2022 at 7:25 Jan Høydahl <jan.asf@cominvent.com> wrote:
>
> You'll not be able to create a GH issue with a script, on behalf of the original reporter.
> Neither will you be able to add a GH issue comment impersonating the person who made the JIRA comment.
> Rather, all issues and all comments would be made by the script/bot user, and you'd have to add in free-text information about reporter and commenter.
> You may be able to assign an issue to the github-id of the original JIRA assignee though.
>
> See https://stackoverflow.com/questions/36540508/github-api-possible-to-set-reporter-of-issue-to-another-user-when-creating-issu
>
> I'm still against duplicating jira history over to GitHub. I think it will be a confusing mess and not worth the effort. The history split can be mitigated by documentation, and perhaps a search engine :)
>
> If, however, experiments show that a quality migration is possible, then can we please open a separate git repo only for the historic issues? It is easy to search across two repos in GitHub, e.g. https://github.com/pulls?q=is%3Apr+is%3Aclosed+repo%3Aapache%2Flucene+repo%3Aapache%2Flucene-solr+wizard+
>
> Jan
>
> On Jun 18, 2022 at 21:48 Robert Muir <rcmuir@gmail.com> wrote:
>
>
>
> On Sat, Jun 18, 2022, 7:42 AM Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:
>>
>> User id mapping is an important consideration for me.
>>
>> Can we find a mapping from Jira user id to GitHub account anywhere?
>
>
> I think we would have to create it. But my hope would be that maybe 50-100 names would cover large majority of issues.
>>
>> Don't we have to gain the consent of each individual to map both accounts?
>
>
> No, we don't have to ask permission to mention someone with an @username
>
>>
>> On Sat, Jun 18, 2022 at 18:52 Robert Muir <rcmuir@gmail.com> wrote:
>> >
>> > I looked at some related projects on github:
>> > https://github.com/Skraeda/jira-2-github
>> > Does the barebones basics but helps you think of the inputs: "username
>> > mapping", "release -> milestone mapping", etc. Of course for a
>> > username mapping, maybe it's best to just handle the top 99% or so and
>> > let the long-tail just come across as "full name". I also find plenty
>> > of projects that convert "special jira language" to markdown, e.g.
>> > https://github.com/catcombo/jira2markdown
>> > I'm not convinced conversion would be degraded, with a little bit of
>> > thought into the conversion, I think it could actually be *better*.
>> > github issues can do everything jira can, just without the fussy UI.
>> > e.g. issues can have attachments (for all the patch files), and
>> > attachment names can have duplicates. Issues can link to other issues,
>> > commits, or PRs easily.
>> >
>> > It just depends on how much we want to invest into it. If we want to
>> > really go whole-hog, then when we do the initial JIRA->issue
>> > conversion, we should *save that mapping* as a .CSV file or similar.
>> > Because later we could then use it to find/replace URLs in
>> > Changes.txt, source code, benchmark annotations, etc etc. Let's at
>> > least leave the possibility open to do that work as followup.
>> >
>> > I find the idea that we're stuck looking at JIRA forever ridiculous.
>> >
>> > On Sat, Jun 18, 2022 at 3:19 AM Dawid Weiss <dawid.weiss@gmail.com> wrote:
>> > >
>> > >
>> > > I honestly don't know what can be done and what has to be sacrificed. I'm pretty sure it'll be more difficult than svn->git conversion because more factors are involved. One tough thing to somehow preserve may be user names (reporters, etc.). I'm not sure how other projects dealt with that.
>> > >
>> > > Perhaps a way to do it incrementally would be to create a json/xml (structured) dump of jira content and then write a converter into a similar json/xml dump for importing into github. I remember it took many iterations and trial and error for svn->git conversion to eventually reach the final shape and it was simpler and faster to do it locally.
>> > >
>> > > Dawid
>> > >
>> > > On Sat, Jun 18, 2022 at 8:59 AM Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:
>> > >>
>> > >> I'll give it a try though, I'm really skeptical that it can be done
>> > >> with a satisfactory level of quality (we want to "preserve" issue
>> > >> history, not just to have shallow/degraded copies, right?), and the
>> > >> migration will be significantly delayed to figure out the way to
>> > >> properly move all issues to GitHub.
>> > >> If there is another way to bypass this challenge - please let me know.
>> > >>
>> > >> Tomoko
>> > >>
>> > >> On Sat, Jun 18, 2022 at 15:44 Dawid Weiss <dawid.weiss@gmail.com> wrote:
>> > >>
>> > >> >
>> > >> >
>> > >> > Hi Tomoko,
>> > >> >
>> > >> > I've added a few bullet points that script could/should handle under LUCENE-10557, hope you don't mind. If you place these script(s) in the open then perhaps indeed we could try to collaborate and see what can be done.
>> > >> >
>> > >> > Dawid
>> > >> >
>> > >> > On Sat, Jun 18, 2022 at 5:33 AM Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:
>> > >> >>
>> > >> >> Replying to myself - Jira issues can be read via REST API without any
>> > >> >> access token and we can iterate all issues by issue number.
>> > >> >> curl -s https://issues.apache.org/jira/rest/api/latest/issue/LUCENE-10557
>> > >> >>
>> > >> >> Would you please hold the discussion for a while - it's a waste of our
>> > >> >> time without a working prototype to me. I will be back here with a
>> > >> >> sandbox github repo where part of existing jira issues are migrated
>> > >> >> (with the best effort).
>> > >> >> In the process, we could simultaneously figure out the way to operate
>> > >> >> GitHub metadata (milestones/labels).
>> > >> >>
>> > >> >> Tomoko
>> > >> >>
>> > >> >> On Sat, Jun 18, 2022 at 10:41 Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:
>> > >> >>
>> > >> >> >
>> > >> >> > Does anyone have information on API access keys to Jira (preferably,
>> > >> >> > read-only and limited to Lucene project)?
>> > >> >> > https://issues.apache.org/jira/browse/LUCENE-10622
>> > >> >> >
>> > >> >> > On Sat, Jun 18, 2022 at 10:11 Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:
>> > >> >> > >
>> > >> >> > > I feel like we should delay the decision on the migration of existing
>> > >> >> > > issues until we have a clear image of what can be done and what cannot
>> > >> >> > > be done.
>> > >> >> > >
>> > >> >> > > I'll write some migration script that preserves the issue history as
>> > >> >> > > far as possible - then come back here with some experience.
>> > >> >> > > Let's make a decision upon the concrete knowledge and information.
>> > >> >> > >
>> > >> >> > > Tomoko
>> > >> >> > >
>> > >> >> > > On Sat, Jun 18, 2022 at 9:26 Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:
>> > >> >> > > >
>> > >> >> > > > I don't intend to neglect histories in Jira... it's an important,
>> > >> >> > > > valuable asset for all of us and possible contributors in the future.
>> > >> >> > > >
>> > >> >> > > > It's important, *therefore*, I don't want to have the degraded copies
>> > >> >> > > > of them on GitHub.
>> > >> >> > > > We cannot preserve all of history - again, there should be tons of
>> > >> >> > > > unignorable information losses (timestamp, reporter, assignee,
>> > >> >> > > > markdown, metadata that cannot be ported to GitHub) if we attempt to
>> > >> >> > > > migrate the whole Jira history into Github. Rather than trying to have
>> > >> >> > > > such incomplete copies, I would preserve Jira issues in the perfectly
>> > >> >> > > > archived status, then simply refer to them.
>> > >> >> > > >
>> > >> >> > > > Tomoko
>> > >> >> > > >
>> > >> >> > > > On Sat, Jun 18, 2022 at 7:47 Gus Heck <gus.heck@gmail.com> wrote:
>> > >> >> > > > >
>> > >> >> > > > > I hope you count me as someone who sees history as important. It's important in more ways than one however. You gave the example of trying to understand something, and looking at the issue history directly. I also give weight to the scenario where someone has written a blog post about the topic and linked the issue "For the latest see LUCENE-XXXX" for example... Or someone planning upgrades has a spreadsheet of things to track down... The existing links should point to a *complete* history of the issue.
>> > >> >> > > > >
>> > >> >> > > > > I don't see the migration of everything to github as being as critical as you do but I'm not at all against migrating things that are closed if someone wants to do that work, and perhaps even copying over existing open issues periodically as they become closed (and accelerating the close rate by aggressive closing of silent issues). No new issues in Jira sounds fine, even better if enforced by Jira. Proceed from here in Github since that's where the community wants to go. Links to the migrated version automatically added to Jira and/or backlinks to Jira would be just fine too since readers might (hopefully needlessly) worry that something didn't get migrated, we should make it easy to check.
>> > >> >> > > > >
>> > >> >> > > > > What I don't want is for someone to land on an issue via link or via google search (or via search in jira because they are using Jira already for some other apache project), read through it and think A) it never got resolved when it did or B) miss the fact that it got reopened and further changes were made and only have half the story... or any other scenario where they are looking at an incomplete record of the issue. (thus obfuscating/splitting the very important rich history across systems).
>> > >> >> > > > >
>> > >> >> > > > > So that's why I feel issues should be completely tracked in the system where they were created. Syncing old closed stuff into a new system probably is fine so long as there are periodic sweeps to pull in reopens or newly completed issues. We could even sync open things so long as they are clearly marked in the title as having their primary record in Jira and "last synced from JIRA on YYYY-MM-DD" or something in a final comment each time new content is brought over.
>> > >> >> > > > >
>> > >> >> > > > > For simplicity and workload however maybe just sync things when they close. Depends on how much effort the person writing code for syncing things wants to put into it I guess.
>> > >> >> > > > >
>> > >> >> > > > > Although I agree with Dawid on the "What if Elon buys it?" issue, that ship has sailed, the community accepts that risk and we probably should not rehash it.
>> > >> >> > > > >
>> > >> >> > > > > WRT Robert's comments on PRs being issues... this has already worried me because I've already seen a lot of discussion on PR's and I've worried that this stuff has the potential to get lost or be hard to find. If there is one key positive of this move, it is that they will become easier to find since the search in github can find them. I would say that a PR is not a substitute for a well described issue report but that's probably a separate discussion (which I would hope mirrors the policy on small edits like typos or adding comments/javadoc not needing an issue). I've also seen folks who like to clean up and remove old branches and PR's, which is problematic if that's where the important discussion is (possibly a 3rd can of worms there).
>> > >> >> > > > >
>> > >> >> > > > > -Gus
>> > >> >> > > > >
>> > >> >> > > > > On Fri, Jun 17, 2022 at 4:34 PM Robert Muir <rcmuir@gmail.com> wrote:
>> > >> >> > > > >>
>> > >> >> > > > >> On Fri, Jun 17, 2022 at 3:27 PM Dawid Weiss <dawid.weiss@gmail.com> wrote:
>> > >> >> > > > >> >
>> > >> >> > > > >> > I'd be more afraid of what happens to github issues in two years (or longer). Will it look the same? Will it be different? Will it be gone (and how do we get a backup of the issue history then)? Contrary to the apache-hosted Jira, github is very much an independent entity. If Elon Musk decides to buy and close it tomorrow... then what? :)
>> > >> >> > > > >> >
>> > >> >> > > > >>
>> > >> >> > > > >> We already have a ton of github "issues" (pull requests, since PRs are issues).
>> > >> >> > > > >> If you want to "back them up", it's easy, you can paginate thru them
>> > >> >> > > > >> 100 at a time, e.g. run this command, incrementing 'page' until it
>> > >> >> > > > >> returns an empty list:
>> > >> >> > > > >>
>> > >> >> > > > >> curl -H "Accept: application/vnd.github.v3+json" \
>> > >> >> > > > >>   "https://api.github.com/repos/apache/lucene/issues?per_page=100&page=1&direction=asc&state=all" \
>> > >> >> > > > >>   > file1.json
>> > >> >> > > > >>
>> > >> >> > > > >> Yeah of course if you want to backup the comments and stuff, you'll
>> > >> >> > > > >> need to do more.
>> > >> >> > > > >> But it is already the case today, that a ton of this "history" is
>> > >> >> > > > >> already in github issues, as PRs. Most recent JIRAs are just useless
>> > >> >> > > > >> placeholders.
>> > >> >> > > > >> Also the same risks apply to JIRA, except they are not theoretical but real
>> > >> >> > > > >> concerns, no? I thought Atlassian had deprecated "onsite" JIRA to try
>> > >> >> > > > >> to sucker you into their "Atlassian Cloud":
>> > >> >> > > > >> https://www.theregister.com/2020/10/19/atlassian_server_licenses/
>> > >> >> > > > >>
>> > >> >> > > > >> ---------------------------------------------------------------------
>> > >> >> > > > >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> > >> >> > > > >> For additional commands, e-mail: dev-help@lucene.apache.org
>> > >> >> > > > >>
>> > >> >> > > > >
>> > >> >> > > > >
>> > >> >> > > > > --
>> > >> >> > > > > http://www.needhamsoftware.com (work)
>> > >> >> > > > > http://www.the111shift.com (play)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

Re: [RESULT] [VOTE] Migration to GitHub issue from Jira [ In reply to ]

dawid.weiss at gmail

Jun 18, 2022, 11:45 PM

Post #38 of 40 (970 views)

> User id mapping is an important consideration for me.
>

Some mapping has to be present somewhere already. Even very old git commits
point at the right people. Perhaps it's based on e-mail addresses or
something?

https://github.com/apache/lucene/commit/5a2615650e104c0713407637d65ae0ce7c2b257a

When the user isn't available, github just shows the nick, without the link.

https://github.com/apache/lucene/commit/89a554ffab239c0118ccd454d76cdf714d793911

Maybe infra could help with how it's done already for git integration.

Dawid
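
The difference between the two commits linked above can be checked through the GitHub REST API. The following is a rough sketch, not how infra actually does it: as far as I know, GitHub links a commit to an account when the commit's author e-mail matches an e-mail registered on that account, and the commit endpoint then returns a top-level `author` object with a `login` (it is null otherwise). The function names and the sample payloads are made up for illustration.

```python
import json
from urllib.request import Request, urlopen


def fetch_commit(owner, repo, sha):
    """Fetch one commit from the GitHub REST API (requires network access)."""
    req = Request(
        f"https://api.github.com/repos/{owner}/{repo}/commits/{sha}",
        headers={"Accept": "application/vnd.github.v3+json"},
    )
    with urlopen(req) as resp:
        return json.load(resp)


def describe_author(commit):
    """Return '@login' if GitHub linked the commit to an account, else the raw git name."""
    user = commit.get("author")  # GitHub account object, or None when unlinked
    if user and user.get("login"):
        return "@" + user["login"]
    return commit["commit"]["author"]["name"]


# Hand-written sample payloads with hypothetical values, so nothing is fetched here.
linked = {
    "author": {"login": "example-user"},
    "commit": {"author": {"name": "Example User", "email": "user@example.org"}},
}
unlinked = {
    "author": None,
    "commit": {"author": {"name": "Old Committer", "email": "old@example.org"}},
}
print(describe_author(linked))
print(describe_author(unlinked))
```

Calling `describe_author(fetch_commit("apache", "lucene", sha))` on the two hashes above should show which of them GitHub was able to link.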

Re: [RESULT] [VOTE] Migration to GitHub issue from Jira [ In reply to ]

msokolov at gmail

Jun 20, 2022, 5:26 AM

Post #39 of 40 (970 views)

I think the user mapping must be inferred based on membership in the
Apache "organization" https://github.com/settings/organizations

On Sun, Jun 19, 2022 at 2:45 AM Dawid Weiss <dawid.weiss@gmail.com> wrote:
>
>
>> User id mapping is an important consideration for me.
>
>
> Some mapping has to be present somewhere already. Even very old git commits point at the right people. Perhaps it's based on e-mail addresses or something?
>
> https://github.com/apache/lucene/commit/5a2615650e104c0713407637d65ae0ce7c2b257a
>
> When the user isn't available, github just shows the nick, without the link.
>
> https://github.com/apache/lucene/commit/89a554ffab239c0118ccd454d76cdf714d793911
>
> Maybe infra could help with how it's done already for git integration.
>
> Dawid

Re: [RESULT] [VOTE] Migration to GitHub issue from Jira [ In reply to ]

tomoko.uchida.1111 at gmail

Jun 20, 2022, 6:19 AM

Post #40 of 40 (970 views)

Thanks for your suggestions; actually ASF should have information on
the account mapping.
For now, I'll just prepare scripts to embed the mapped github accounts
next to the jira author/assignee name; we could ask infra or create
the mapping on our own by inference if we find it's worthwhile to have
it.
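
A rough sketch of what "embedding the mapped account next to the name" could look like follows. This is an illustration only, not the actual migration script: the mapping table, the Jira user ids, the GitHub logins, and the function names are all hypothetical.

```python
def format_user(display_name, jira_id, account_map):
    """Render a Jira user as 'Full Name (@github-login)' when a mapping exists."""
    login = account_map.get(jira_id)
    if login:
        return f"{display_name} (@{login})"
    # Long-tail users without a known GitHub account come across as the full name only.
    return display_name


def attribution_header(reporter, assignee, account_map):
    """Build the free-text attribution lines placed at the top of a migrated issue body."""
    lines = ["Reporter: " + format_user(reporter[0], reporter[1], account_map)]
    if assignee is not None:
        lines.append("Assignee: " + format_user(assignee[0], assignee[1], account_map))
    return "\n".join(lines)


# Hypothetical mapping from Jira user id to GitHub login.
account_map = {"jdoe": "jane-doe-gh"}

header = attribution_header(("Jane Doe", "jdoe"), ("John Smith", "jsmith"), account_map)
print(header)
```

Keeping the mapping in a plain dictionary means it could later be filled from whatever source turns out to be available, whether infra or our own inference.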

I think we can discuss the "how" on the issue (LUCENE-10557). I don't
think many people are interested in the full details of such matters,
though they are practically important.

On Mon, Jun 20, 2022 at 21:26 Michael Sokolov <msokolov@gmail.com> wrote:
>
> I think the user mapping must be inferred based on membership in the
> Apache "organization" https://github.com/settings/organizations
>
> On Sun, Jun 19, 2022 at 2:45 AM Dawid Weiss <dawid.weiss@gmail.com> wrote:
> >
> >
> >> User id mapping is an important consideration for me.
> >
> >
> > Some mapping has to be present somewhere already. Even very old git commits point at the right people. Perhaps it's based on e-mail addresses or something?
> >
> > https://github.com/apache/lucene/commit/5a2615650e104c0713407637d65ae0ce7c2b257a
> >
> > When the user isn't available, github just shows the nick, without the link.
> >
> > https://github.com/apache/lucene/commit/89a554ffab239c0118ccd454d76cdf714d793911
> >
> > Maybe infra could help with how it's done already for git integration.
> >
> > Dawid