go: store/datas/pull: clone.go: Improve robustness of Clone for certain remoteapi implementations when the remote conjoins.

The clone code works by listing the remote table files and downloading them into the local table file store. When the remote is a remoteapi implementation, like a DoltHub repository, this results in listing the remote table files and using URLs to fetch each of them.

The URLs returned from these APIs can expire and they need to be refreshed. This refresh can happen in two ways:

1) The TableFileSource representation returned by the API can include an explicit mechanism to refresh its URL. DoltHub provides this, and the Dolt client uses it to refresh expired URLs.
2) The heavy-handed approach is to list the table files again and use the newly returned URLs.

The Clone code has explicit support for #2, which is necessary for remoteapi implementations that have expiring URLs but no explicit RefreshTableFileUrl support. Dolt itself, for example when running a remote as part of sql-server, does not implement RefreshTableFileUrl, so the re-list support is still necessary.

This PR changes the Clone implementation so that, on a retry, it makes all the newly returned table file sources available for the next attempt, but keeps the old sources for any table files that ListTableFiles no longer returns, for example because the remote has conjoined them away or garbage collected them. In this way, we get strictly more robust behavior than before.
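
The merge described above can be sketched in isolation as follows. This is a minimal, self-contained illustration: the tableFile type and the mergeRefreshed helper are hypothetical and only mirror the shape of the fileIDToTF map update in the diff.

```go
package main

import "fmt"

// tableFile is a hypothetical, simplified stand-in for a table file source.
type tableFile struct {
	FileID string
	URL    string
}

// mergeRefreshed overwrites entries in current with their refreshed
// counterparts and leaves entries that are missing from refreshed untouched.
func mergeRefreshed(current, refreshed map[string]tableFile) {
	for id, tf := range refreshed {
		current[id] = tf
	}
}

func main() {
	current := map[string]tableFile{
		"a": {FileID: "a", URL: "https://example.invalid/a?sig=old"},
		"b": {FileID: "b", URL: "https://example.invalid/b?sig=old"},
	}
	// Assume the remote conjoined in the meantime: "b" is no longer
	// listed and a new file "c" has appeared.
	refreshed := map[string]tableFile{
		"a": {FileID: "a", URL: "https://example.invalid/a?sig=new"},
		"c": {FileID: "c", URL: "https://example.invalid/c?sig=new"},
	}
	mergeRefreshed(current, refreshed)
	for _, id := range []string{"a", "b", "c"} {
		fmt.Println(id, current[id].URL)
	}
}
```

Because the loop only adds or overwrites keys, no source is dropped merely because a later listing omitted it.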

The downside is that, when a remote file is actually gone, the Clone code will keep attempting to download it until it reaches a terminal download failure. This new behavior is less disruptive than the current behavior, so we accept the tradeoff for now.
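
To make the downside concrete, here is a small sketch of a bounded retry loop, assuming a fixed attempt budget similar in spirit to the maxAttempts check in the diff. The downloadWithRetries helper and errGone are hypothetical and are not the actual clone code.

```go
package main

import (
	"errors"
	"fmt"
)

// errGone models a table file that has been removed from remote storage.
var errGone = errors.New("table file no longer exists on the remote")

// downloadWithRetries is a hypothetical helper. It calls fetch until it
// succeeds or until maxAttempts attempts have failed, and returns the
// number of attempts made along with the last error.
func downloadWithRetries(fetch func() error, maxAttempts int) (attempts int, err error) {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	for attempts < maxAttempts {
		attempts++
		if err = fetch(); err == nil {
			return attempts, nil
		}
	}
	return attempts, err
}

func main() {
	// A stale source whose file is really gone fails on every attempt, so
	// the whole attempt budget is spent before the error surfaces.
	attempts, err := downloadWithRetries(func() error { return errGone }, 3)
	fmt.Println(attempts, err)
}
```

The cost of keeping a stale source is therefore bounded by the attempt budget rather than being an unbounded retry.
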

Author: Aaron Son
Date: 2025-06-03 15:59:08 -07:00
Parent: 7a7030bb30
Commit: b38567ff35
2 changed files with 38 additions and 3 deletions


@@ -36,6 +36,7 @@ import (
 	"go.opentelemetry.io/otel/attribute"
 	"go.opentelemetry.io/otel/trace"
 	"golang.org/x/sync/errgroup"
+	"google.golang.org/protobuf/types/known/timestamppb"
 	remotesapi "github.com/dolthub/dolt/go/gen/proto/dolt/services/remotesapi/v1alpha1"
 	"github.com/dolthub/dolt/go/libraries/doltcore/remotestorage/internal/reliable"
@@ -1228,6 +1229,17 @@ func (drtf DoltRemoteTableFile) Open(ctx context.Context) (io.ReadCloser, uint64
 	}
 	if resp.StatusCode/100 != 2 {
+		// XXX: If we fail to fetch, and we have the ability
+		// to refresh the URL, set ourselves up to refresh the
+		// next time we are used. This can help if the
+		// remoteapi endpoint gave us a URL that actually
+		// expires for some reason before the RefreshAfter
+		// timestamp. (Local clock drift, remote key rotation,
+		// etc.)
+		if drtf.info.RefreshAfter != nil {
+			drtf.info.RefreshAfter = timestamppb.Now()
+			drtf.info.RefreshAfter.Seconds -= 10
+		}
 		defer resp.Body.Close()
 		body := make([]byte, 4096)
 		n, _ := io.ReadFull(resp.Body, body)


@@ -204,11 +204,34 @@ func clone(ctx context.Context, srcTS, sinkTS chunks.TableFileStore, sinkCS chun
 			if failureCount >= maxAttempts {
 				return err
 			}
-			if _, sourceFiles, appendixFiles, err = srcTS.Sources(ctx); err != nil {
+			if _, refreshedSourceFiles, refreshedAppendixFiles, err := srcTS.Sources(ctx); err != nil {
 				return err
 			} else {
-				tblFiles = filterAppendicesFromSourceFiles(appendixFiles, sourceFiles)
-				_, fileIDToTF, _ = mapTableFiles(tblFiles)
+				refreshedTblFiles := filterAppendicesFromSourceFiles(refreshedAppendixFiles, refreshedSourceFiles)
+				_, refreshedFileIDToTF, _ := mapTableFiles(refreshedTblFiles)
+				// Sources() will refresh remote table file
+				// sources with new download URLs. However, it
+				// will only return URLs for table files which
+				// are in the remote manifest, which could
+				// have changed since the clone started. Here
+				// we keep around any old TableFile instances
+				// for any TableFiles which have been
+				// conjoined away or have been the victim of a
+				// garbage collection run on the remote.
+				//
+				// If these files are no longer accessible,
+				// for example because the URLs expired
+				// without a RefreshTableFileUrlRequest being
+				// provided, or because the table files
+				// themselves have been removed from storage,
+				// then continuing to use these sources will
+				// eventually fail terminally. But in the
+				// case of doltremoteapi on DoltHub, using
+				// these Sources() will continue to work and
+				// will allow the Clone to proceed.
+				for k, v := range refreshedFileIDToTF {
+					fileIDToTF[k] = v
+				}
 			}
 		}