Compare commits

59 Commits

Author SHA1 Message Date
openeuler-ci-bot
4244eb1cd9
!181 [sync] PR-180: [AArch64] Support HiSilicon's HIP09 sched model
From: @openeuler-sync-bot 
Reviewed-by: @eastb233 
Signed-off-by: @eastb233
2024-11-25 03:17:59 +00:00
xiajingze
f08325c08c [AArch64] Support HiSilicon's HIP09 sched model
Signed-off-by: xiajingze <xiajingze1@huawei.com>
(cherry picked from commit 16c2fa56344da079171f6b9f0151b98deed0af91)
2024-11-25 09:40:17 +08:00
openeuler-ci-bot
ad1341bec0
!177 [sync] PR-176: ACPO Infrastructure for ML integration into LLVM compiler
From: @openeuler-sync-bot 
Reviewed-by: @eastb233 
Signed-off-by: @eastb233
2024-11-20 10:50:19 +00:00
eastb233
b3f456737d Find Python3 in default env PATH for ACPO
Sync https://gitee.com/openeuler/llvm-project/pulls/102

(cherry picked from commit 41e289aa66b2e8a068fc4bc21d63661878921d99)
2024-11-20 18:33:41 +08:00
eastb233
8883917445 ACPO Infrastructure for ML integration into LLVM compiler
Sync https://gitee.com/openeuler/llvm-project/pulls/89
and build.sh in https://gitee.com/openeuler/llvm-project/pulls/92

(cherry picked from commit 8f5e9315bb0f90e7efa610731832780d27fe6037)
2024-11-20 18:33:41 +08:00
openeuler-ci-bot
791c03733c
!175 [sync] PR-173: [SimplifyLibCalls] Merge sqrt into the power of exp (#79146) && [LICM] Solve runtime error caused by the signal function.
From: @openeuler-sync-bot 
Reviewed-by: @eastb233 
Signed-off-by: @eastb233
2024-11-20 07:26:43 +00:00
eastb233
d8e386d7fe [LICM] Solve runtime error caused by the signal function.
Sync https://gitee.com/openeuler/llvm-project/pulls/77

(cherry picked from commit fe8d18290462a7e599e706ccf78ffb9ab194dc1a)
2024-11-20 12:03:56 +08:00
eastb233
b38cde7035 [SimplifyLibCalls] Merge sqrt into the power of exp (#79146)
Sync https://gitee.com/openeuler/llvm-project/pulls/76

(cherry picked from commit e595b466af2018f7d739cb6b6ec24697a2bb8464)
2024-11-20 12:03:56 +08:00
openeuler-ci-bot
b974980003
!172 [sync] PR-171: [AArch64] Delete hip09 macro && [backport][Clang] Fix crash with -fzero-call-used-regs
From: @openeuler-sync-bot 
Reviewed-by: @eastb233 
Signed-off-by: @eastb233
2024-11-20 01:19:23 +00:00
xiajingze
a4deed733e [backport][Clang] Fix crash with -fzero-call-used-regs
Signed-off-by: xiajingze <xiajingze1@huawei.com>
(cherry picked from commit d549dcb722349678013666446a2e99d17c69e59c)
2024-11-20 00:07:21 +08:00
xiajingze
05329f777b [AArch64] Delete hip09 macro
Signed-off-by: xiajingze <xiajingze1@huawei.com>
(cherry picked from commit acb1c24cf2a5c512dc7d61dd58eab94f4a33c7cd)
2024-11-20 00:07:21 +08:00
openeuler-ci-bot
4af6572f3a
!170 [sync] PR-164: Add arch restriction for BiSheng Autotuner
From: @openeuler-sync-bot 
Reviewed-by: @eastb233 
Signed-off-by: @eastb233
2024-11-19 10:47:34 +00:00
liyunfei
d54dd2eaf0 Add arch restriction for BiSheng Autotuner
Signed-off-by: liyunfei <liyunfei33@huawei.com>
(cherry picked from commit 8837457e73d3fc458b8c7f4d687c7fc0d6bb57fb)
2024-11-19 18:46:41 +08:00
openeuler-ci-bot
6a68b69cf9
!168 [sync] PR-166: [Backport] Simple check to ignore Inline asm fwait insertion
From: @openeuler-sync-bot 
Reviewed-by: @eastb233 
Signed-off-by: @eastb233
2024-11-19 08:49:49 +00:00
liyunfei
45ff6f68c3 [Backport] Simple check to ignore Inline asm fwait insertion
Signed-off-by: liyunfei <liyunfei33@huawei.com>
(cherry picked from commit 9083dc26a0b2c3eb6f80adcd469a6181287bbf6e)
2024-11-19 15:18:11 +08:00
openeuler-ci-bot
8943f6328a
!157 [sync] PR-154: [Backport][LoongArch] Fix and add some new support
From: @openeuler-sync-bot 
Reviewed-by: @cf-zhao 
Signed-off-by: @cf-zhao
2024-10-12 00:34:05 +00:00
Ami-zhang
623acebaa6 [Backport][LoongArch] Fix and add some new support
(cherry picked from commit f29e3618e267533beb40d05dc5974488ce5f7847)
2024-10-11 17:49:17 +08:00
openeuler-ci-bot
f7be2638ca
!156 [sync] PR-155: Fix the issue that the date in the changelog is not sorted in descending order.
From: @openeuler-sync-bot 
Reviewed-by: @cf-zhao 
Signed-off-by: @cf-zhao
2024-10-11 06:38:30 +00:00
cf-zhao
80bf5380a0 Fix the issue that the date in the changelog is not sorted in
descending order.

(cherry picked from commit c7e8c735ce72bb09fe47dbbd7f53722ff6cd9183)
2024-10-11 09:59:26 +08:00
openeuler-ci-bot
a0725de2c4
!153 [sync] PR-148: [AArch64] Support HiSilicon's HIP09 Processor
From: @openeuler-sync-bot 
Reviewed-by: @liyunfei33 
Signed-off-by: @liyunfei33
2024-09-13 07:36:59 +00:00
xiajingze
a8cfa61489 [AArch64] Support HiSilicon's HIP09 Processor
(cherry picked from commit 95487e968ff91c07708ed07075820405d5a8b960)
2024-09-12 21:49:19 +08:00
openeuler-ci-bot
4d72b8e986
!152 [sync] PR-151: doc add Provides llvm-help
From: @openeuler-sync-bot 
Reviewed-by: @liyunfei33 
Signed-off-by: @liyunfei33
2024-09-12 11:23:11 +00:00
hongjinghao
406b179b49 doc add Provides llvm-help
(cherry picked from commit 329d4ab7997e1c4e427b311db986be7215503834)
2024-09-12 09:31:03 +08:00
openeuler-ci-bot
552f089b68
!149 [sync] PR-146: doc add Obsoletes llvm-help
From: @openeuler-sync-bot 
Reviewed-by: @cf-zhao 
Signed-off-by: @cf-zhao
2024-09-11 01:34:33 +00:00
hongjinghao
3e4b4fcc87 doc add Obsoletes llvm-help
(cherry picked from commit 865578178c9bbf96e725dd37b34b9cef587df281)
2024-09-10 16:42:19 +08:00
openeuler-ci-bot
b76688231e
!145 [sync] PR-141: mv man to doc subpackage
From: @openeuler-sync-bot 
Reviewed-by: @liyunfei33 
Signed-off-by: @liyunfei33
2024-09-10 01:23:01 +00:00
hongjinghao
99903bd672 mv man to doc subpackage
(cherry picked from commit cd66419e417f492591824298d239e51782750ee5)
2024-09-06 21:26:07 +08:00
openeuler-ci-bot
590b980ffc
!144 [sync] PR-140: Prevent environment variables from exceeding NAME_MAX.
From: @openeuler-sync-bot 
Reviewed-by: @liyunfei33 
Signed-off-by: @liyunfei33
2024-09-06 13:25:56 +00:00
liyunfei
f3670876e5 Prevent environment variables from exceeding NAME_MAX
(cherry picked from commit 4996d19ff15e056b817dc50067b6ea52669c5f24)
2024-09-06 16:17:36 +08:00
openeuler-ci-bot
e5ad654e46
!139 [sync] PR-135: Disable toolchain_clang build for BiSheng Autotuner support temporarily
From: @openeuler-sync-bot 
Reviewed-by: @liyunfei33 
Signed-off-by: @liyunfei33
2024-09-03 11:11:22 +00:00
liyunfei
069f629df7 Disable toolchain_clang build
Disable toolchain_clang build for BiSheng Autotuner support temporarily.

Signed-off-by: liyunfei <liyunfei33@huawei.com>
(cherry picked from commit 4dce262ee770ebe675eaf42d50b13b54348af5e9)
2024-09-03 17:39:19 +08:00
openeuler-ci-bot
520e062268
!138 [sync] PR-125: Add BiSheng Autotuner support
From: @openeuler-sync-bot 
Reviewed-by: @liyunfei33 
Signed-off-by: @liyunfei33
2024-09-03 09:38:35 +00:00
liyunfei
aca9c09071 Add BiSheng Autotuner support
Signed-off-by: liyunfei <liyunfei33@huawei.com>
(cherry picked from commit 776d4f615bc87ec2775ee7e94c44e51ca56da0b2)
2024-09-03 16:31:51 +08:00
openeuler-ci-bot
1087136411
!137 [sync] PR-134: Add toolchain_clang build support
From: @openeuler-sync-bot 
Reviewed-by: @liyunfei33 
Signed-off-by: @liyunfei33
2024-09-03 08:31:11 +00:00
liyunfei
88f3e45ed8 Add toolchain_clang build support
Signed-off-by: liyunfei <liyunfei33@huawei.com>
(cherry picked from commit 02ab7ced7eea43a784dce393877e5c1d7210460d)
2024-09-03 15:01:48 +08:00
openeuler-ci-bot
a5110dd911
!132 [sync] PR-129: Revert "Support stack clash protection"
From: @openeuler-sync-bot 
Reviewed-by: @liyunfei33 
Signed-off-by: @liyunfei33
2024-09-03 06:32:36 +00:00
cf-zhao
51f4a7d312 Revert "Support stack clash protection"
This reverts commit 4f4298791f15f26e0649f57c6edfd999af51ec41.

(cherry picked from commit f9af047c9f0602b71489d2f042fecdbe22ae100f)
2024-05-20 09:04:51 +08:00
openeuler-ci-bot
6d7af1becf
!127 [sync] PR-124: Support stack clash protection
From: @openeuler-sync-bot 
Reviewed-by: @cf-zhao 
Signed-off-by: @cf-zhao
2024-05-17 02:37:08 +00:00
rickyleung
700751006e Support stack clash protection
(cherry picked from commit 4f4298791f15f26e0649f57c6edfd999af51ec41)
2024-05-13 15:00:04 +08:00
openeuler-ci-bot
ede988ff44
!123 [sync] PR-121: Update llvm-lit config to support macro build_for_openeuler
From: @openeuler-sync-bot 
Reviewed-by: @cf-zhao 
Signed-off-by: @cf-zhao
2024-05-03 00:00:07 +00:00
wangqiang
9327ad0fab Update llvm-lit config to support macro build_for_openeuler
(cherry picked from commit 2b03ba072ed723b232d1b29a1be921b2536de495)
2024-05-01 19:19:45 +08:00
openeuler-ci-bot
1a7a96c62f
!119 [sync] PR-113: [Backport][LoongArch] Improve the support for atomic and clear_cache
From: @openeuler-sync-bot 
Reviewed-by: @cf-zhao 
Signed-off-by: @cf-zhao
2024-04-28 02:57:11 +00:00
Ami-zhang
73499e9115 [Backport][LoongArch] Improve the support for atomic and clear_cache
(cherry picked from commit 374a99221881e29f75891c82a38acc4ba65a17e1)
2024-04-26 09:32:57 +08:00
openeuler-ci-bot
a8517f9424
!115 [sync] PR-82: [ClassicFlang] Add the support for classic flang.
From: @openeuler-sync-bot 
Reviewed-by: @cf-zhao 
Signed-off-by: @cf-zhao
2024-04-26 01:14:54 +00:00
luofeng14
29111d6e55 Add the support for classic-flang
(cherry picked from commit c18d0cd9f75f5d7c05818f1dcaef6a3a6ea33232)
2024-04-19 16:00:48 +08:00
openeuler-ci-bot
5b03eda1dd
!112 [sync] PR-110: fix some typo
From: @openeuler-sync-bot 
Reviewed-by: @cf-zhao 
Signed-off-by: @cf-zhao
2024-04-16 02:18:00 +00:00
liyunfei
cb05b56921 fix some typo
(cherry picked from commit ae7f796c92df9a13db28974bc99921eb922c2ef5)
2024-04-16 10:17:33 +08:00
openeuler-ci-bot
df9493e61b
!109 [sync] PR-105: Backport patch to fix CVE-2024-31852
From: @openeuler-sync-bot 
Reviewed-by: @cf-zhao 
Signed-off-by: @cf-zhao
2024-04-15 09:44:48 +00:00
liyunfei
ce0a2fbef7 Backport patch to fix CVE-2024-31852
reference:
b1a5ee1feb
and
749384c08e

Signed-off-by: liyunfei <liyunfei33@huawei.com>
(cherry picked from commit 332fb03bb7615eb0f23f10d953b8b168a8c319a1)
2024-04-15 14:42:28 +08:00
openeuler-ci-bot
80348b01a4
!107 [sync] PR-97: [Backport][X86][Inline] Skip inline asm in inlining target feature check
From: @openeuler-sync-bot 
Reviewed-by: @cf-zhao 
Signed-off-by: @cf-zhao
2024-04-15 01:18:02 +00:00
wangqiang
f6bb1b6dd7 [Backport][X86][Inline] Skip inline asm in inlining target feature check
reference: 8c6015db59

Signed-off-by: wangqiang <wangqiang1@kylinos.cn>
(cherry picked from commit 4606ba60e6d02da53420f910ac047108c3b88c48)
2024-04-13 17:30:06 +08:00
openeuler-ci-bot
410a2c1e56
!101 [sync] PR-87: Backport patch to fix CVE-2023-46049
From: @openeuler-sync-bot 
Reviewed-by: @cf-zhao 
Signed-off-by: @cf-zhao
2024-04-13 05:50:31 +00:00
liyunfei
ab7ce29778 Backport patch to fix CVE-2023-46049
reference:c2515a8f2b

Signed-off-by: liyunfei <liyunfei33@huawei.com>
(cherry picked from commit 86e71f1261e4179e36923e08d2b518016a77b835)
2024-04-12 14:46:34 +08:00
openeuler-ci-bot
b78fcfa91d
!99 [sync] PR-86: [Backport][LoongArch] Improve the support for compiler-rt and bugfix
From: @openeuler-sync-bot 
Reviewed-by: @cf-zhao 
Signed-off-by: @cf-zhao
2024-04-12 06:23:52 +00:00
Ami-zhang
5eae06c472 [Backport][LoongArch] Improve the support for compiler-rt and bugfix
(cherry picked from commit b86be6988c4e9c6b488369fe6d74a274827ae1a9)
2024-04-11 20:08:54 +08:00
openeuler-ci-bot
e8b73631a6
!92 [sync] PR-85: [Backport][LoongArch] Add the support for vector in llvm17
From: @openeuler-sync-bot 
Reviewed-by: @cf-zhao 
Signed-off-by: @cf-zhao
2024-04-11 12:06:25 +00:00
Ami-zhang
190a438b45 [Backport][LoongArch] Add the support for vector in llvm17
(cherry picked from commit 52d71959de0a5dca6d4dde05a8223685df449d1c)
2024-04-11 09:29:10 +08:00
openeuler-ci-bot
dea14417fd
!90 [sync] PR-84: [Backport][LoongArch] Support relax feature on LoongArch in llvm17
From: @openeuler-sync-bot 
Reviewed-by: @cf-zhao 
Signed-off-by: @cf-zhao
2024-04-11 01:17:28 +00:00
Ami-zhang
d72a7d64b1 [Backport][LoongArch] Support relax feature
(cherry picked from commit 8cf6fb10e49140722443997596f2ae55bec9d525)
2024-04-10 16:17:52 +08:00
35 changed files with 104492 additions and 4 deletions

@@ -0,0 +1,178 @@
From 6f135b13769c64a6942b4b232a350b6a6207f2b2 Mon Sep 17 00:00:00 2001
From: Jinyang He <hejinyang@loongson.cn>
Date: Thu, 16 Nov 2023 11:01:26 +0800
Subject: [PATCH 02/14] [LoongArch] Add relax feature and keep relocations
(#72191)
Add the relax feature. To support linker relocation, relocations should be
made against a symbol rather than section plus offset, and all relocations
against non-absolute symbols should be kept.
(cherry picked from commit f5bfc833fcbf17a5876911783d1adaca7028d20c)
Change-Id: Ief38b480016175f2cc9939b74a84d9444559ffd6
---
llvm/lib/Target/LoongArch/LoongArch.td | 4 +++
.../lib/Target/LoongArch/LoongArchSubtarget.h | 2 ++
.../MCTargetDesc/LoongArchAsmBackend.cpp | 5 +--
.../MCTargetDesc/LoongArchELFObjectWriter.cpp | 18 ++++++++---
.../MCTargetDesc/LoongArchMCTargetDesc.h | 2 +-
.../MC/LoongArch/Relocations/relax-attr.s | 32 +++++++++++++++++++
6 files changed, 55 insertions(+), 8 deletions(-)
create mode 100644 llvm/test/MC/LoongArch/Relocations/relax-attr.s
diff --git a/llvm/lib/Target/LoongArch/LoongArch.td b/llvm/lib/Target/LoongArch/LoongArch.td
index 0675caa3b601..75b65fe69f26 100644
--- a/llvm/lib/Target/LoongArch/LoongArch.td
+++ b/llvm/lib/Target/LoongArch/LoongArch.td
@@ -102,6 +102,10 @@ def FeatureUAL
: SubtargetFeature<"ual", "HasUAL", "true",
"Allow memory accesses to be unaligned">;
+def FeatureRelax
+ : SubtargetFeature<"relax", "HasLinkerRelax", "true",
+ "Enable Linker relaxation">;
+
//===----------------------------------------------------------------------===//
// Registers, instruction descriptions ...
//===----------------------------------------------------------------------===//
diff --git a/llvm/lib/Target/LoongArch/LoongArchSubtarget.h b/llvm/lib/Target/LoongArch/LoongArchSubtarget.h
index 0fbe23f2f62d..5c173675cca4 100644
--- a/llvm/lib/Target/LoongArch/LoongArchSubtarget.h
+++ b/llvm/lib/Target/LoongArch/LoongArchSubtarget.h
@@ -43,6 +43,7 @@ class LoongArchSubtarget : public LoongArchGenSubtargetInfo {
bool HasLaGlobalWithAbs = false;
bool HasLaLocalWithAbs = false;
bool HasUAL = false;
+ bool HasLinkerRelax = false;
unsigned GRLen = 32;
MVT GRLenVT = MVT::i32;
LoongArchABI::ABI TargetABI = LoongArchABI::ABI_Unknown;
@@ -100,6 +101,7 @@ public:
bool hasLaGlobalWithAbs() const { return HasLaGlobalWithAbs; }
bool hasLaLocalWithAbs() const { return HasLaLocalWithAbs; }
bool hasUAL() const { return HasUAL; }
+ bool hasLinkerRelax() const { return HasLinkerRelax; }
MVT getGRLenVT() const { return GRLenVT; }
unsigned getGRLen() const { return GRLen; }
LoongArchABI::ABI getTargetABI() const { return TargetABI; }
diff --git a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.cpp b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.cpp
index ecb68ff401e9..aae3e544d326 100644
--- a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.cpp
+++ b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.cpp
@@ -168,7 +168,7 @@ bool LoongArchAsmBackend::shouldForceRelocation(const MCAssembler &Asm,
return true;
switch (Fixup.getTargetKind()) {
default:
- return false;
+ return STI.hasFeature(LoongArch::FeatureRelax);
case FK_Data_1:
case FK_Data_2:
case FK_Data_4:
@@ -193,7 +193,8 @@ bool LoongArchAsmBackend::writeNopData(raw_ostream &OS, uint64_t Count,
std::unique_ptr<MCObjectTargetWriter>
LoongArchAsmBackend::createObjectTargetWriter() const {
- return createLoongArchELFObjectWriter(OSABI, Is64Bit);
+ return createLoongArchELFObjectWriter(
+ OSABI, Is64Bit, STI.hasFeature(LoongArch::FeatureRelax));
}
MCAsmBackend *llvm::createLoongArchAsmBackend(const Target &T,
diff --git a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchELFObjectWriter.cpp b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchELFObjectWriter.cpp
index a6b9c0652639..e60b9c2cfd97 100644
--- a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchELFObjectWriter.cpp
+++ b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchELFObjectWriter.cpp
@@ -20,19 +20,27 @@ using namespace llvm;
namespace {
class LoongArchELFObjectWriter : public MCELFObjectTargetWriter {
public:
- LoongArchELFObjectWriter(uint8_t OSABI, bool Is64Bit);
+ LoongArchELFObjectWriter(uint8_t OSABI, bool Is64Bit, bool EnableRelax);
~LoongArchELFObjectWriter() override;
+ bool needsRelocateWithSymbol(const MCSymbol &Sym,
+ unsigned Type) const override {
+ return EnableRelax;
+ }
+
protected:
unsigned getRelocType(MCContext &Ctx, const MCValue &Target,
const MCFixup &Fixup, bool IsPCRel) const override;
+ bool EnableRelax;
};
} // end namespace
-LoongArchELFObjectWriter::LoongArchELFObjectWriter(uint8_t OSABI, bool Is64Bit)
+LoongArchELFObjectWriter::LoongArchELFObjectWriter(uint8_t OSABI, bool Is64Bit,
+ bool EnableRelax)
: MCELFObjectTargetWriter(Is64Bit, OSABI, ELF::EM_LOONGARCH,
- /*HasRelocationAddend*/ true) {}
+ /*HasRelocationAddend=*/true),
+ EnableRelax(EnableRelax) {}
LoongArchELFObjectWriter::~LoongArchELFObjectWriter() {}
@@ -87,6 +95,6 @@ unsigned LoongArchELFObjectWriter::getRelocType(MCContext &Ctx,
}
std::unique_ptr<MCObjectTargetWriter>
-llvm::createLoongArchELFObjectWriter(uint8_t OSABI, bool Is64Bit) {
- return std::make_unique<LoongArchELFObjectWriter>(OSABI, Is64Bit);
+llvm::createLoongArchELFObjectWriter(uint8_t OSABI, bool Is64Bit, bool Relax) {
+ return std::make_unique<LoongArchELFObjectWriter>(OSABI, Is64Bit, Relax);
}
diff --git a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCTargetDesc.h b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCTargetDesc.h
index ab35a0096c8a..bb05baa9b717 100644
--- a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCTargetDesc.h
+++ b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCTargetDesc.h
@@ -36,7 +36,7 @@ MCAsmBackend *createLoongArchAsmBackend(const Target &T,
const MCTargetOptions &Options);
std::unique_ptr<MCObjectTargetWriter>
-createLoongArchELFObjectWriter(uint8_t OSABI, bool Is64Bit);
+createLoongArchELFObjectWriter(uint8_t OSABI, bool Is64Bit, bool Relax);
} // end namespace llvm
diff --git a/llvm/test/MC/LoongArch/Relocations/relax-attr.s b/llvm/test/MC/LoongArch/Relocations/relax-attr.s
new file mode 100644
index 000000000000..b1e648d850bb
--- /dev/null
+++ b/llvm/test/MC/LoongArch/Relocations/relax-attr.s
@@ -0,0 +1,32 @@
+# RUN: llvm-mc --filetype=obj --triple=loongarch64 %s -o %t
+# RUN: llvm-readobj -r %t | FileCheck %s
+# RUN: llvm-mc --filetype=obj --triple=loongarch64 -mattr=+relax %s -o %t
+# RUN: llvm-readobj -r %t | FileCheck %s --check-prefix=CHECKR
+
+# CHECK: Relocations [
+# CHECK-NEXT: Section ({{.*}}) .rela.data {
+# CHECK-NEXT: 0x0 R_LARCH_64 .text 0x4
+# CHECK-NEXT: }
+# CHECK-NEXT: ]
+
+# CHECKR: Relocations [
+# CHECKR-NEXT: Section ({{.*}}) .rela.text {
+# CHECKR-NEXT: 0x8 R_LARCH_B21 .L1 0x0
+# CHECKR-NEXT: 0xC R_LARCH_B16 .L1 0x0
+# CHECKR-NEXT: 0x10 R_LARCH_B26 .L1 0x0
+# CHECKR-NEXT: }
+# CHECKR-NEXT: Section ({{.*}}) .rela.data {
+# CHECKR-NEXT: 0x0 R_LARCH_64 .L1 0x0
+# CHECKR-NEXT: }
+# CHECKR-NEXT: ]
+
+.text
+ nop
+.L1:
+ nop
+ beqz $a0, .L1
+ blt $a0, $a1, .L1
+ b .L1
+
+.data
+.dword .L1
--
2.20.1
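
The effect of this patch on relocation records can be sketched in isolation. The following is a minimal standalone model, not the LLVM code: `Reloc`, `makeReloc`, and `keepsBranchReloc` are hypothetical names introduced here for illustration, and the values mirror the `CHECK`/`CHECKR` lines in relax-attr.s under the assumption that `.L1` sits at offset 4 in `.text`.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for one relocation entry as llvm-readobj prints it.
struct Reloc {
  std::string Target; // symbol or section name the relocation refers to
  long Addend;        // addend stored in the RELA entry
};

// Models the choice gated by needsRelocateWithSymbol(): with relax enabled
// the relocation keeps the symbol itself; otherwise it is expressed as
// section + offset.
Reloc makeReloc(bool EnableRelax, const std::string &Symbol,
                const std::string &Section, long SymbolOffset) {
  if (EnableRelax)
    return {Symbol, 0};
  return {Section, SymbolOffset};
}

// Models the new default case of shouldForceRelocation(): fixups that would
// otherwise be resolved at assembly time keep a relocation only when the
// relax feature is on.
bool keepsBranchReloc(bool EnableRelax) { return EnableRelax; }
```

With relax disabled the model yields `.text 0x4` and no branch relocations; with relax enabled it yields `.L1 0x0` and keeps them, which is the difference the test file checks for.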

@@ -0,0 +1,299 @@
From 77d74b8fa071fa2695c9782e2e63e7b930895b1b Mon Sep 17 00:00:00 2001
From: Jinyang He <hejinyang@loongson.cn>
Date: Wed, 20 Dec 2023 10:54:51 +0800
Subject: [PATCH 03/14] [LoongArch] Allow delayed decision for ADD/SUB
relocations (#72960)
Following RISC-V [1], LoongArch also needs delayed decisions for ADD/SUB
relocations. In handleAddSubRelocations, return directly if SecA !=
SecB; handleFixup will usually finish the rest of the work of creating PCRel
relocations. Otherwise, the relocations emitted depend on whether
relaxation is enabled. If it is not, we return true and avoid recording ADD/SUB
relocations.
Now two symbols separated by an alignment directive will return without
folding the symbol offset in AttemptToFoldSymbolOffsetDifference, which has
the same effect when relaxation is enabled.
[1] https://reviews.llvm.org/D155357
(cherry picked from commit a8081ed8ff0fd11fb8d5f4c83df49da909e49612)
Change-Id: Ic4c6a3eb11b576cb0c6ed0eba02150ad67c33cf2
---
llvm/lib/MC/MCExpr.cpp | 3 +-
.../MCTargetDesc/LoongArchAsmBackend.cpp | 78 +++++++++++++++++++
.../MCTargetDesc/LoongArchAsmBackend.h | 9 ++-
.../MCTargetDesc/LoongArchFixupKinds.h | 4 +-
llvm/test/MC/LoongArch/Misc/subsection.s | 38 +++++++++
.../MC/LoongArch/Relocations/relax-addsub.s | 68 ++++++++++++++++
6 files changed, 196 insertions(+), 4 deletions(-)
create mode 100644 llvm/test/MC/LoongArch/Misc/subsection.s
create mode 100644 llvm/test/MC/LoongArch/Relocations/relax-addsub.s
diff --git a/llvm/lib/MC/MCExpr.cpp b/llvm/lib/MC/MCExpr.cpp
index a7b980553af0..5a6596f93824 100644
--- a/llvm/lib/MC/MCExpr.cpp
+++ b/llvm/lib/MC/MCExpr.cpp
@@ -635,7 +635,8 @@ static void AttemptToFoldSymbolOffsetDifference(
// instructions and InSet is false (not expressions in directive like
// .size/.fill), disable the fast path.
if (Layout && (InSet || !SecA.hasInstructions() ||
- !Asm->getContext().getTargetTriple().isRISCV())) {
+ !(Asm->getContext().getTargetTriple().isRISCV() ||
+ Asm->getContext().getTargetTriple().isLoongArch()))) {
// If both symbols are in the same fragment, return the difference of their
// offsets. canGetFragmentOffset(FA) may be false.
if (FA == FB && !SA.isVariable() && !SB.isVariable()) {
diff --git a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.cpp b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.cpp
index aae3e544d326..1ed047a8e632 100644
--- a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.cpp
+++ b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.cpp
@@ -177,6 +177,34 @@ bool LoongArchAsmBackend::shouldForceRelocation(const MCAssembler &Asm,
}
}
+static inline std::pair<MCFixupKind, MCFixupKind>
+getRelocPairForSize(unsigned Size) {
+ switch (Size) {
+ default:
+ llvm_unreachable("unsupported fixup size");
+ case 6:
+ return std::make_pair(
+ MCFixupKind(FirstLiteralRelocationKind + ELF::R_LARCH_ADD6),
+ MCFixupKind(FirstLiteralRelocationKind + ELF::R_LARCH_SUB6));
+ case 8:
+ return std::make_pair(
+ MCFixupKind(FirstLiteralRelocationKind + ELF::R_LARCH_ADD8),
+ MCFixupKind(FirstLiteralRelocationKind + ELF::R_LARCH_SUB8));
+ case 16:
+ return std::make_pair(
+ MCFixupKind(FirstLiteralRelocationKind + ELF::R_LARCH_ADD16),
+ MCFixupKind(FirstLiteralRelocationKind + ELF::R_LARCH_SUB16));
+ case 32:
+ return std::make_pair(
+ MCFixupKind(FirstLiteralRelocationKind + ELF::R_LARCH_ADD32),
+ MCFixupKind(FirstLiteralRelocationKind + ELF::R_LARCH_SUB32));
+ case 64:
+ return std::make_pair(
+ MCFixupKind(FirstLiteralRelocationKind + ELF::R_LARCH_ADD64),
+ MCFixupKind(FirstLiteralRelocationKind + ELF::R_LARCH_SUB64));
+ }
+}
+
bool LoongArchAsmBackend::writeNopData(raw_ostream &OS, uint64_t Count,
const MCSubtargetInfo *STI) const {
// We mostly follow binutils' convention here: align to 4-byte boundary with a
@@ -191,6 +219,56 @@ bool LoongArchAsmBackend::writeNopData(raw_ostream &OS, uint64_t Count,
return true;
}
+bool LoongArchAsmBackend::handleAddSubRelocations(const MCAsmLayout &Layout,
+ const MCFragment &F,
+ const MCFixup &Fixup,
+ const MCValue &Target,
+ uint64_t &FixedValue) const {
+ std::pair<MCFixupKind, MCFixupKind> FK;
+ uint64_t FixedValueA, FixedValueB;
+ const MCSection &SecA = Target.getSymA()->getSymbol().getSection();
+ const MCSection &SecB = Target.getSymB()->getSymbol().getSection();
+
+ // We need record relocation if SecA != SecB. Usually SecB is same as the
+ // section of Fixup, which will be record the relocation as PCRel. If SecB
+ // is not same as the section of Fixup, it will report error. Just return
+ // false and then this work can be finished by handleFixup.
+ if (&SecA != &SecB)
+ return false;
+
+ // In SecA == SecB case. If the linker relaxation is enabled, we need record
+ // the ADD, SUB relocations. Otherwise the FixedValue has already been
+ // calculated out in evaluateFixup, return true and avoid record relocations.
+ if (!STI.hasFeature(LoongArch::FeatureRelax))
+ return true;
+
+ switch (Fixup.getKind()) {
+ case llvm::FK_Data_1:
+ FK = getRelocPairForSize(8);
+ break;
+ case llvm::FK_Data_2:
+ FK = getRelocPairForSize(16);
+ break;
+ case llvm::FK_Data_4:
+ FK = getRelocPairForSize(32);
+ break;
+ case llvm::FK_Data_8:
+ FK = getRelocPairForSize(64);
+ break;
+ default:
+ llvm_unreachable("unsupported fixup size");
+ }
+ MCValue A = MCValue::get(Target.getSymA(), nullptr, Target.getConstant());
+ MCValue B = MCValue::get(Target.getSymB());
+ auto FA = MCFixup::create(Fixup.getOffset(), nullptr, std::get<0>(FK));
+ auto FB = MCFixup::create(Fixup.getOffset(), nullptr, std::get<1>(FK));
+ auto &Asm = Layout.getAssembler();
+ Asm.getWriter().recordRelocation(Asm, Layout, &F, FA, A, FixedValueA);
+ Asm.getWriter().recordRelocation(Asm, Layout, &F, FB, B, FixedValueB);
+ FixedValue = FixedValueA - FixedValueB;
+ return true;
+}
+
std::unique_ptr<MCObjectTargetWriter>
LoongArchAsmBackend::createObjectTargetWriter() const {
return createLoongArchELFObjectWriter(
diff --git a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.h b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.h
index ae9bb8af0419..20f25b5cf53b 100644
--- a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.h
+++ b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.h
@@ -31,10 +31,15 @@ class LoongArchAsmBackend : public MCAsmBackend {
public:
LoongArchAsmBackend(const MCSubtargetInfo &STI, uint8_t OSABI, bool Is64Bit,
const MCTargetOptions &Options)
- : MCAsmBackend(support::little), STI(STI), OSABI(OSABI), Is64Bit(Is64Bit),
- TargetOptions(Options) {}
+ : MCAsmBackend(support::little,
+ LoongArch::fixup_loongarch_relax),
+ STI(STI), OSABI(OSABI), Is64Bit(Is64Bit), TargetOptions(Options) {}
~LoongArchAsmBackend() override {}
+ bool handleAddSubRelocations(const MCAsmLayout &Layout, const MCFragment &F,
+ const MCFixup &Fixup, const MCValue &Target,
+ uint64_t &FixedValue) const override;
+
void applyFixup(const MCAssembler &Asm, const MCFixup &Fixup,
const MCValue &Target, MutableArrayRef<char> Data,
uint64_t Value, bool IsResolved,
diff --git a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchFixupKinds.h b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchFixupKinds.h
index ba2d6718cdf9..178fa6e5262b 100644
--- a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchFixupKinds.h
+++ b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchFixupKinds.h
@@ -106,7 +106,9 @@ enum Fixups {
// 20-bit fixup corresponding to %gd_pc_hi20(foo) for instruction pcalau12i.
fixup_loongarch_tls_gd_pc_hi20,
// 20-bit fixup corresponding to %gd_hi20(foo) for instruction lu12i.w.
- fixup_loongarch_tls_gd_hi20
+ fixup_loongarch_tls_gd_hi20,
+ // Generate an R_LARCH_RELAX which indicates the linker may relax here.
+ fixup_loongarch_relax = FirstLiteralRelocationKind + ELF::R_LARCH_RELAX
};
} // end namespace LoongArch
} // end namespace llvm
diff --git a/llvm/test/MC/LoongArch/Misc/subsection.s b/llvm/test/MC/LoongArch/Misc/subsection.s
new file mode 100644
index 000000000000..0bd22b474536
--- /dev/null
+++ b/llvm/test/MC/LoongArch/Misc/subsection.s
@@ -0,0 +1,38 @@
+# RUN: not llvm-mc --filetype=obj --triple=loongarch64 --mattr=-relax %s -o /dev/null 2>&1 | FileCheck %s --check-prefixes=ERR,NORELAX --implicit-check-not=error:
+## TODO: not llvm-mc --filetype=obj --triple=loongarch64 --mattr=+relax %s -o /dev/null 2>&1 | FileCheck %s --check-prefixes=ERR,RELAX --implicit-check-not=error:
+
+a:
+ nop
+b:
+ la.pcrel $t0, a
+c:
+ nop
+d:
+
+.data
+## Positive subsection numbers
+## With relaxation, report an error as c-b is not an assemble-time constant.
+# RELAX: :[[#@LINE+1]]:14: error: cannot evaluate subsection number
+.subsection c-b
+# RELAX: :[[#@LINE+1]]:14: error: cannot evaluate subsection number
+.subsection d-b
+# RELAX: :[[#@LINE+1]]:14: error: cannot evaluate subsection number
+.subsection c-a
+
+.subsection b-a
+.subsection d-c
+
+## Negative subsection numbers
+# NORELAX: :[[#@LINE+2]]:14: error: subsection number -8 is not within [0,2147483647]
+# RELAX: :[[#@LINE+1]]:14: error: cannot evaluate subsection number
+.subsection b-c
+# NORELAX: :[[#@LINE+2]]:14: error: subsection number -12 is not within [0,2147483647]
+# RELAX: :[[#@LINE+1]]:14: error: cannot evaluate subsection number
+.subsection b-d
+# NORELAX: :[[#@LINE+2]]:14: error: subsection number -12 is not within [0,2147483647]
+# RELAX: :[[#@LINE+1]]:14: error: cannot evaluate subsection number
+.subsection a-c
+# ERR: :[[#@LINE+1]]:14: error: subsection number -4 is not within [0,2147483647]
+.subsection a-b
+# ERR: :[[#@LINE+1]]:14: error: subsection number -4 is not within [0,2147483647]
+.subsection c-d
diff --git a/llvm/test/MC/LoongArch/Relocations/relax-addsub.s b/llvm/test/MC/LoongArch/Relocations/relax-addsub.s
new file mode 100644
index 000000000000..532eb4e0561a
--- /dev/null
+++ b/llvm/test/MC/LoongArch/Relocations/relax-addsub.s
@@ -0,0 +1,68 @@
+# RUN: llvm-mc --filetype=obj --triple=loongarch64 --mattr=-relax %s \
+# RUN: | llvm-readobj -r -x .data - | FileCheck %s --check-prefix=NORELAX
+# RUN: llvm-mc --filetype=obj --triple=loongarch64 --mattr=+relax %s \
+# RUN: | llvm-readobj -r -x .data - | FileCheck %s --check-prefix=RELAX
+
+# NORELAX: Relocations [
+# NORELAX-NEXT: Section ({{.*}}) .rela.text {
+# NORELAX-NEXT: 0x10 R_LARCH_PCALA_HI20 .text 0x0
+# NORELAX-NEXT: 0x14 R_LARCH_PCALA_LO12 .text 0x0
+# NORELAX-NEXT: }
+# NORELAX-NEXT: ]
+
+# NORELAX: Hex dump of section '.data':
+# NORELAX-NEXT: 0x00000000 04040004 00000004 00000000 0000000c
+# NORELAX-NEXT: 0x00000010 0c000c00 00000c00 00000000 00000808
+# NORELAX-NEXT: 0x00000020 00080000 00080000 00000000 00
+
+# RELAX: Relocations [
+# RELAX-NEXT: Section ({{.*}}) .rela.text {
+# RELAX-NEXT: 0x10 R_LARCH_PCALA_HI20 .L1 0x0
+# RELAX-NEXT: 0x14 R_LARCH_PCALA_LO12 .L1 0x0
+# RELAX-NEXT: }
+# RELAX-NEXT: Section ({{.*}}) .rela.data {
+# RELAX-NEXT: 0xF R_LARCH_ADD8 .L3 0x0
+# RELAX-NEXT: 0xF R_LARCH_SUB8 .L2 0x0
+# RELAX-NEXT: 0x10 R_LARCH_ADD16 .L3 0x0
+# RELAX-NEXT: 0x10 R_LARCH_SUB16 .L2 0x0
+# RELAX-NEXT: 0x12 R_LARCH_ADD32 .L3 0x0
+# RELAX-NEXT: 0x12 R_LARCH_SUB32 .L2 0x0
+# RELAX-NEXT: 0x16 R_LARCH_ADD64 .L3 0x0
+# RELAX-NEXT: 0x16 R_LARCH_SUB64 .L2 0x0
+# RELAX-NEXT: }
+# RELAX-NEXT: ]
+
+# RELAX: Hex dump of section '.data':
+# RELAX-NEXT: 0x00000000 04040004 00000004 00000000 00000000
+# RELAX-NEXT: 0x00000010 00000000 00000000 00000000 00000808
+# RELAX-NEXT: 0x00000020 00080000 00080000 00000000 00
+
+.text
+.L1:
+ nop
+.L2:
+ .align 4
+.L3:
+ la.pcrel $t0, .L1
+.L4:
+ ret
+
+.data
+## Not emit relocs
+.byte .L2 - .L1
+.short .L2 - .L1
+.word .L2 - .L1
+.dword .L2 - .L1
+## With relaxation, emit relocs because of the .align making the diff variable.
+## TODO Handle alignment directive. Why they emit relocs now? They returns
+## without folding symbols offset in AttemptToFoldSymbolOffsetDifference().
+.byte .L3 - .L2
+.short .L3 - .L2
+.word .L3 - .L2
+.dword .L3 - .L2
+## TODO
+## With relaxation, emit relocs because la.pcrel is a linker-relaxable inst.
+.byte .L4 - .L3
+.short .L4 - .L3
+.word .L4 - .L3
+.dword .L4 - .L3
--
2.20.1
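
The decision flow in handleAddSubRelocations can be summarized with a small standalone sketch. This is an illustrative model only: `Outcome`, `classifyAddSub`, and `relocPairForBytes` are hypothetical names, and the real function returns a bool and records relocations through the object writer rather than returning an enum.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical labels for the three paths through handleAddSubRelocations.
enum class Outcome {
  DeferToHandleFixup, // SecA != SecB: return false, handleFixup takes over
  FoldedNoReloc,      // same section, relax off: value already computed
  EmitAddSubPair      // same section, relax on: record ADD and SUB relocs
};

Outcome classifyAddSub(bool SameSection, bool RelaxEnabled) {
  if (!SameSection)
    return Outcome::DeferToHandleFixup;
  if (!RelaxEnabled)
    return Outcome::FoldedNoReloc;
  return Outcome::EmitAddSubPair;
}

// Maps a data fixup width in bytes (FK_Data_1/2/4/8) to the ADD/SUB
// relocation names from the patch. Returns null pointers for other widths.
std::pair<const char *, const char *> relocPairForBytes(unsigned Bytes) {
  switch (Bytes) {
  case 1:
    return {"R_LARCH_ADD8", "R_LARCH_SUB8"};
  case 2:
    return {"R_LARCH_ADD16", "R_LARCH_SUB16"};
  case 4:
    return {"R_LARCH_ADD32", "R_LARCH_SUB32"};
  case 8:
    return {"R_LARCH_ADD64", "R_LARCH_SUB64"};
  default:
    return {nullptr, nullptr};
  }
}
```

For example, `.short .L3 - .L2` with relax enabled falls in the `EmitAddSubPair` case and uses the 2-byte pair, matching the `R_LARCH_ADD16`/`R_LARCH_SUB16` lines expected by relax-addsub.s.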

@@ -0,0 +1,364 @@
From f2495d7efb79fdc82af6147f7201d9cf3c91beba Mon Sep 17 00:00:00 2001
From: Jinyang He <hejinyang@loongson.cn>
Date: Wed, 27 Dec 2023 08:51:48 +0800
Subject: [PATCH 04/14] [LoongArch] Emit R_LARCH_RELAX when expanding some
LoadAddress (#72961)
Emit relax relocs when expanding non-large la.pcrel and non-large la.got at the
llvm-mc stage, as GAS does.
1. la.pcrel -> PCALA_HI20 + RELAX + PCALA_LO12 + RELAX
2. la.got -> GOT_PC_HI20 + RELAX + GOT_PC_LO12 + RELAX
(cherry picked from commit b3ef8dce9811b2725639b0d4fac3f85c7e112817)
Change-Id: I222daf60b36ee70e23c76b753e1d2a3b8148f44b
---
.../AsmParser/LoongArchAsmParser.cpp | 12 +--
.../MCTargetDesc/LoongArchMCCodeEmitter.cpp | 13 +++
.../MCTargetDesc/LoongArchMCExpr.cpp | 7 +-
.../LoongArch/MCTargetDesc/LoongArchMCExpr.h | 8 +-
llvm/test/MC/LoongArch/Macros/macros-la.s | 84 ++++++++++++++++---
llvm/test/MC/LoongArch/Misc/subsection.s | 2 +-
.../MC/LoongArch/Relocations/relax-addsub.s | 16 +++-
7 files changed, 115 insertions(+), 27 deletions(-)
diff --git a/llvm/lib/Target/LoongArch/AsmParser/LoongArchAsmParser.cpp b/llvm/lib/Target/LoongArch/AsmParser/LoongArchAsmParser.cpp
index 94d530306536..a132e645c864 100644
--- a/llvm/lib/Target/LoongArch/AsmParser/LoongArchAsmParser.cpp
+++ b/llvm/lib/Target/LoongArch/AsmParser/LoongArchAsmParser.cpp
@@ -86,7 +86,7 @@ class LoongArchAsmParser : public MCTargetAsmParser {
// "emitLoadAddress*" functions.
void emitLAInstSeq(MCRegister DestReg, MCRegister TmpReg,
const MCExpr *Symbol, SmallVectorImpl<Inst> &Insts,
- SMLoc IDLoc, MCStreamer &Out);
+ SMLoc IDLoc, MCStreamer &Out, bool RelaxHint = false);
// Helper to emit pseudo instruction "la.abs $rd, sym".
void emitLoadAddressAbs(MCInst &Inst, SMLoc IDLoc, MCStreamer &Out);
@@ -749,12 +749,14 @@ bool LoongArchAsmParser::ParseInstruction(ParseInstructionInfo &Info,
void LoongArchAsmParser::emitLAInstSeq(MCRegister DestReg, MCRegister TmpReg,
const MCExpr *Symbol,
SmallVectorImpl<Inst> &Insts,
- SMLoc IDLoc, MCStreamer &Out) {
+ SMLoc IDLoc, MCStreamer &Out,
+ bool RelaxHint) {
MCContext &Ctx = getContext();
for (LoongArchAsmParser::Inst &Inst : Insts) {
unsigned Opc = Inst.Opc;
LoongArchMCExpr::VariantKind VK = Inst.VK;
- const LoongArchMCExpr *LE = LoongArchMCExpr::create(Symbol, VK, Ctx);
+ const LoongArchMCExpr *LE =
+ LoongArchMCExpr::create(Symbol, VK, Ctx, RelaxHint);
switch (Opc) {
default:
llvm_unreachable("unexpected opcode");
@@ -855,7 +857,7 @@ void LoongArchAsmParser::emitLoadAddressPcrel(MCInst &Inst, SMLoc IDLoc,
Insts.push_back(
LoongArchAsmParser::Inst(ADDI, LoongArchMCExpr::VK_LoongArch_PCALA_LO12));
- emitLAInstSeq(DestReg, DestReg, Symbol, Insts, IDLoc, Out);
+ emitLAInstSeq(DestReg, DestReg, Symbol, Insts, IDLoc, Out, true);
}
void LoongArchAsmParser::emitLoadAddressPcrelLarge(MCInst &Inst, SMLoc IDLoc,
@@ -901,7 +903,7 @@ void LoongArchAsmParser::emitLoadAddressGot(MCInst &Inst, SMLoc IDLoc,
Insts.push_back(
LoongArchAsmParser::Inst(LD, LoongArchMCExpr::VK_LoongArch_GOT_PC_LO12));
- emitLAInstSeq(DestReg, DestReg, Symbol, Insts, IDLoc, Out);
+ emitLAInstSeq(DestReg, DestReg, Symbol, Insts, IDLoc, Out, true);
}
void LoongArchAsmParser::emitLoadAddressGotLarge(MCInst &Inst, SMLoc IDLoc,
diff --git a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCCodeEmitter.cpp b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCCodeEmitter.cpp
index 03fb9e008ae9..08c0820cb862 100644
--- a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCCodeEmitter.cpp
+++ b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCCodeEmitter.cpp
@@ -19,6 +19,7 @@
#include "llvm/MC/MCInstBuilder.h"
#include "llvm/MC/MCInstrInfo.h"
#include "llvm/MC/MCRegisterInfo.h"
+#include "llvm/MC/MCSubtargetInfo.h"
#include "llvm/Support/Casting.h"
#include "llvm/Support/EndianStream.h"
@@ -120,12 +121,15 @@ LoongArchMCCodeEmitter::getExprOpValue(const MCInst &MI, const MCOperand &MO,
SmallVectorImpl<MCFixup> &Fixups,
const MCSubtargetInfo &STI) const {
assert(MO.isExpr() && "getExprOpValue expects only expressions");
+ bool RelaxCandidate = false;
+ bool EnableRelax = STI.hasFeature(LoongArch::FeatureRelax);
const MCExpr *Expr = MO.getExpr();
MCExpr::ExprKind Kind = Expr->getKind();
LoongArch::Fixups FixupKind = LoongArch::fixup_loongarch_invalid;
if (Kind == MCExpr::Target) {
const LoongArchMCExpr *LAExpr = cast<LoongArchMCExpr>(Expr);
+ RelaxCandidate = LAExpr->getRelaxHint();
switch (LAExpr->getKind()) {
case LoongArchMCExpr::VK_LoongArch_None:
case LoongArchMCExpr::VK_LoongArch_Invalid:
@@ -269,6 +273,15 @@ LoongArchMCCodeEmitter::getExprOpValue(const MCInst &MI, const MCOperand &MO,
Fixups.push_back(
MCFixup::create(0, Expr, MCFixupKind(FixupKind), MI.getLoc()));
+
+ // Emit an R_LARCH_RELAX if linker relaxation is enabled and LAExpr has relax
+ // hint.
+ if (EnableRelax && RelaxCandidate) {
+ const MCConstantExpr *Dummy = MCConstantExpr::create(0, Ctx);
+ Fixups.push_back(MCFixup::create(
+ 0, Dummy, MCFixupKind(LoongArch::fixup_loongarch_relax), MI.getLoc()));
+ }
+
return 0;
}
diff --git a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCExpr.cpp b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCExpr.cpp
index 993111552a31..82c992b1cc8c 100644
--- a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCExpr.cpp
+++ b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCExpr.cpp
@@ -25,9 +25,10 @@ using namespace llvm;
#define DEBUG_TYPE "loongarch-mcexpr"
-const LoongArchMCExpr *
-LoongArchMCExpr::create(const MCExpr *Expr, VariantKind Kind, MCContext &Ctx) {
- return new (Ctx) LoongArchMCExpr(Expr, Kind);
+const LoongArchMCExpr *LoongArchMCExpr::create(const MCExpr *Expr,
+ VariantKind Kind, MCContext &Ctx,
+ bool Hint) {
+ return new (Ctx) LoongArchMCExpr(Expr, Kind, Hint);
}
void LoongArchMCExpr::printImpl(raw_ostream &OS, const MCAsmInfo *MAI) const {
diff --git a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCExpr.h b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCExpr.h
index 0945cf82db86..93251f824103 100644
--- a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCExpr.h
+++ b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCExpr.h
@@ -67,16 +67,18 @@ public:
private:
const MCExpr *Expr;
const VariantKind Kind;
+ const bool RelaxHint;
- explicit LoongArchMCExpr(const MCExpr *Expr, VariantKind Kind)
- : Expr(Expr), Kind(Kind) {}
+ explicit LoongArchMCExpr(const MCExpr *Expr, VariantKind Kind, bool Hint)
+ : Expr(Expr), Kind(Kind), RelaxHint(Hint) {}
public:
static const LoongArchMCExpr *create(const MCExpr *Expr, VariantKind Kind,
- MCContext &Ctx);
+ MCContext &Ctx, bool Hint = false);
VariantKind getKind() const { return Kind; }
const MCExpr *getSubExpr() const { return Expr; }
+ bool getRelaxHint() const { return RelaxHint; }
void printImpl(raw_ostream &OS, const MCAsmInfo *MAI) const override;
bool evaluateAsRelocatableImpl(MCValue &Res, const MCAsmLayout *Layout,
diff --git a/llvm/test/MC/LoongArch/Macros/macros-la.s b/llvm/test/MC/LoongArch/Macros/macros-la.s
index 924e4326b8e5..1a1d12d7d7df 100644
--- a/llvm/test/MC/LoongArch/Macros/macros-la.s
+++ b/llvm/test/MC/LoongArch/Macros/macros-la.s
@@ -1,66 +1,128 @@
# RUN: llvm-mc --triple=loongarch64 %s | FileCheck %s
+# RUN: llvm-mc --filetype=obj --triple=loongarch64 --mattr=-relax %s -o %t
+# RUN: llvm-readobj -r %t | FileCheck %s --check-prefix=RELOC
+# RUN: llvm-mc --filetype=obj --triple=loongarch64 --mattr=+relax %s -o %t.relax
+# RUN: llvm-readobj -r %t.relax | FileCheck %s --check-prefixes=RELOC,RELAX
+
+# RELOC: Relocations [
+# RELOC-NEXT: Section ({{.*}}) .rela.text {
la.abs $a0, sym_abs
# CHECK: lu12i.w $a0, %abs_hi20(sym_abs)
# CHECK-NEXT: ori $a0, $a0, %abs_lo12(sym_abs)
# CHECK-NEXT: lu32i.d $a0, %abs64_lo20(sym_abs)
# CHECK-NEXT: lu52i.d $a0, $a0, %abs64_hi12(sym_abs)
+# CHECK-EMPTY:
+# RELOC-NEXT: R_LARCH_ABS_HI20 sym_abs 0x0
+# RELOC-NEXT: R_LARCH_ABS_LO12 sym_abs 0x0
+# RELOC-NEXT: R_LARCH_ABS64_LO20 sym_abs 0x0
+# RELOC-NEXT: R_LARCH_ABS64_HI12 sym_abs 0x0
la.pcrel $a0, sym_pcrel
-# CHECK: pcalau12i $a0, %pc_hi20(sym_pcrel)
+# CHECK-NEXT: pcalau12i $a0, %pc_hi20(sym_pcrel)
# CHECK-NEXT: addi.d $a0, $a0, %pc_lo12(sym_pcrel)
+# CHECK-EMPTY:
+# RELOC-NEXT: R_LARCH_PCALA_HI20 sym_pcrel 0x0
+# RELAX-NEXT: R_LARCH_RELAX - 0x0
+# RELOC-NEXT: R_LARCH_PCALA_LO12 sym_pcrel 0x0
+# RELAX-NEXT: R_LARCH_RELAX - 0x0
la.pcrel $a0, $a1, sym_pcrel_large
-# CHECK: pcalau12i $a0, %pc_hi20(sym_pcrel_large)
+# CHECK-NEXT: pcalau12i $a0, %pc_hi20(sym_pcrel_large)
# CHECK-NEXT: addi.d $a1, $zero, %pc_lo12(sym_pcrel_large)
# CHECK-NEXT: lu32i.d $a1, %pc64_lo20(sym_pcrel_large)
# CHECK-NEXT: lu52i.d $a1, $a1, %pc64_hi12(sym_pcrel_large)
# CHECK-NEXT: add.d $a0, $a0, $a1
+# CHECK-EMPTY:
+# RELOC-NEXT: R_LARCH_PCALA_HI20 sym_pcrel_large 0x0
+# RELOC-NEXT: R_LARCH_PCALA_LO12 sym_pcrel_large 0x0
+# RELOC-NEXT: R_LARCH_PCALA64_LO20 sym_pcrel_large 0x0
+# RELOC-NEXT: R_LARCH_PCALA64_HI12 sym_pcrel_large 0x0
la.got $a0, sym_got
-# CHECK: pcalau12i $a0, %got_pc_hi20(sym_got)
+# CHECK-NEXT: pcalau12i $a0, %got_pc_hi20(sym_got)
# CHECK-NEXT: ld.d $a0, $a0, %got_pc_lo12(sym_got)
+# CHECK-EMPTY:
+# RELOC-NEXT: R_LARCH_GOT_PC_HI20 sym_got 0x0
+# RELAX-NEXT: R_LARCH_RELAX - 0x0
+# RELOC-NEXT: R_LARCH_GOT_PC_LO12 sym_got 0x0
+# RELAX-NEXT: R_LARCH_RELAX - 0x0
la.got $a0, $a1, sym_got_large
-# CHECK: pcalau12i $a0, %got_pc_hi20(sym_got_large)
+# CHECK-NEXT: pcalau12i $a0, %got_pc_hi20(sym_got_large)
# CHECK-NEXT: addi.d $a1, $zero, %got_pc_lo12(sym_got_large)
# CHECK-NEXT: lu32i.d $a1, %got64_pc_lo20(sym_got_large)
# CHECK-NEXT: lu52i.d $a1, $a1, %got64_pc_hi12(sym_got_large)
# CHECK-NEXT: ldx.d $a0, $a0, $a1
+# CHECK-EMPTY:
+# RELOC-NEXT: R_LARCH_GOT_PC_HI20 sym_got_large 0x0
+# RELOC-NEXT: R_LARCH_GOT_PC_LO12 sym_got_large 0x0
+# RELOC-NEXT: R_LARCH_GOT64_PC_LO20 sym_got_large 0x0
+# RELOC-NEXT: R_LARCH_GOT64_PC_HI12 sym_got_large 0x0
la.tls.le $a0, sym_le
-# CHECK: lu12i.w $a0, %le_hi20(sym_le)
+# CHECK-NEXT: lu12i.w $a0, %le_hi20(sym_le)
# CHECK-NEXT: ori $a0, $a0, %le_lo12(sym_le)
+# CHECK-EMPTY:
+# RELOC-NEXT: R_LARCH_TLS_LE_HI20 sym_le 0x0
+# RELOC-NEXT: R_LARCH_TLS_LE_LO12 sym_le 0x0
la.tls.ie $a0, sym_ie
-# CHECK: pcalau12i $a0, %ie_pc_hi20(sym_ie)
+# CHECK-NEXT: pcalau12i $a0, %ie_pc_hi20(sym_ie)
# CHECK-NEXT: ld.d $a0, $a0, %ie_pc_lo12(sym_ie)
+# CHECK-EMPTY:
+# RELOC-NEXT: R_LARCH_TLS_IE_PC_HI20 sym_ie 0x0
+# RELOC-NEXT: R_LARCH_TLS_IE_PC_LO12 sym_ie 0x0
la.tls.ie $a0, $a1, sym_ie_large
-# CHECK: pcalau12i $a0, %ie_pc_hi20(sym_ie_large)
+# CHECK-NEXT: pcalau12i $a0, %ie_pc_hi20(sym_ie_large)
# CHECK-NEXT: addi.d $a1, $zero, %ie_pc_lo12(sym_ie_large)
# CHECK-NEXT: lu32i.d $a1, %ie64_pc_lo20(sym_ie_large)
# CHECK-NEXT: lu52i.d $a1, $a1, %ie64_pc_hi12(sym_ie_large)
# CHECK-NEXT: ldx.d $a0, $a0, $a1
+# CHECK-EMPTY:
+# RELOC-NEXT: R_LARCH_TLS_IE_PC_HI20 sym_ie_large 0x0
+# RELOC-NEXT: R_LARCH_TLS_IE_PC_LO12 sym_ie_large 0x0
+# RELOC-NEXT: R_LARCH_TLS_IE64_PC_LO20 sym_ie_large 0x0
+# RELOC-NEXT: R_LARCH_TLS_IE64_PC_HI12 sym_ie_large 0x0
la.tls.ld $a0, sym_ld
-# CHECK: pcalau12i $a0, %ld_pc_hi20(sym_ld)
+# CHECK-NEXT: pcalau12i $a0, %ld_pc_hi20(sym_ld)
# CHECK-NEXT: addi.d $a0, $a0, %got_pc_lo12(sym_ld)
+# CHECK-EMPTY:
+# RELOC-NEXT: R_LARCH_TLS_LD_PC_HI20 sym_ld 0x0
+# RELOC-NEXT: R_LARCH_GOT_PC_LO12 sym_ld 0x0
la.tls.ld $a0, $a1, sym_ld_large
-# CHECK: pcalau12i $a0, %ld_pc_hi20(sym_ld_large)
+# CHECK-NEXT: pcalau12i $a0, %ld_pc_hi20(sym_ld_large)
# CHECK-NEXT: addi.d $a1, $zero, %got_pc_lo12(sym_ld_large)
# CHECK-NEXT: lu32i.d $a1, %got64_pc_lo20(sym_ld_large)
# CHECK-NEXT: lu52i.d $a1, $a1, %got64_pc_hi12(sym_ld_large)
# CHECK-NEXT: add.d $a0, $a0, $a1
+# CHECK-EMPTY:
+# RELOC-NEXT: R_LARCH_TLS_LD_PC_HI20 sym_ld_large 0x0
+# RELOC-NEXT: R_LARCH_GOT_PC_LO12 sym_ld_large 0x0
+# RELOC-NEXT: R_LARCH_GOT64_PC_LO20 sym_ld_large 0x0
+# RELOC-NEXT: R_LARCH_GOT64_PC_HI12 sym_ld_large 0x0
la.tls.gd $a0, sym_gd
-# CHECK: pcalau12i $a0, %gd_pc_hi20(sym_gd)
+# CHECK-NEXT: pcalau12i $a0, %gd_pc_hi20(sym_gd)
# CHECK-NEXT: addi.d $a0, $a0, %got_pc_lo12(sym_gd)
+# CHECK-EMPTY:
+# RELOC-NEXT: R_LARCH_TLS_GD_PC_HI20 sym_gd 0x0
+# RELOC-NEXT: R_LARCH_GOT_PC_LO12 sym_gd 0x0
la.tls.gd $a0, $a1, sym_gd_large
-# CHECK: pcalau12i $a0, %gd_pc_hi20(sym_gd_large)
+# CHECK-NEXT: pcalau12i $a0, %gd_pc_hi20(sym_gd_large)
# CHECK-NEXT: addi.d $a1, $zero, %got_pc_lo12(sym_gd_large)
# CHECK-NEXT: lu32i.d $a1, %got64_pc_lo20(sym_gd_large)
# CHECK-NEXT: lu52i.d $a1, $a1, %got64_pc_hi12(sym_gd_large)
# CHECK-NEXT: add.d $a0, $a0, $a1
+# CHECK-EMPTY:
+# RELOC-NEXT: R_LARCH_TLS_GD_PC_HI20 sym_gd_large 0x0
+# RELOC-NEXT: R_LARCH_GOT_PC_LO12 sym_gd_large 0x0
+# RELOC-NEXT: R_LARCH_GOT64_PC_LO20 sym_gd_large 0x0
+# RELOC-NEXT: R_LARCH_GOT64_PC_HI12 sym_gd_large 0x0
+
+# RELOC-NEXT: }
+# RELOC-NEXT: ]
diff --git a/llvm/test/MC/LoongArch/Misc/subsection.s b/llvm/test/MC/LoongArch/Misc/subsection.s
index 0bd22b474536..566a2408d691 100644
--- a/llvm/test/MC/LoongArch/Misc/subsection.s
+++ b/llvm/test/MC/LoongArch/Misc/subsection.s
@@ -1,5 +1,5 @@
# RUN: not llvm-mc --filetype=obj --triple=loongarch64 --mattr=-relax %s -o /dev/null 2>&1 | FileCheck %s --check-prefixes=ERR,NORELAX --implicit-check-not=error:
-## TODO: not llvm-mc --filetype=obj --triple=loongarch64 --mattr=+relax %s -o /dev/null 2>&1 | FileCheck %s --check-prefixes=ERR,RELAX --implicit-check-not=error:
+# RUN: not llvm-mc --filetype=obj --triple=loongarch64 --mattr=+relax %s -o /dev/null 2>&1 | FileCheck %s --check-prefixes=ERR,RELAX --implicit-check-not=error:
a:
nop
diff --git a/llvm/test/MC/LoongArch/Relocations/relax-addsub.s b/llvm/test/MC/LoongArch/Relocations/relax-addsub.s
index 532eb4e0561a..c4454f5bb98d 100644
--- a/llvm/test/MC/LoongArch/Relocations/relax-addsub.s
+++ b/llvm/test/MC/LoongArch/Relocations/relax-addsub.s
@@ -18,7 +18,9 @@
# RELAX: Relocations [
# RELAX-NEXT: Section ({{.*}}) .rela.text {
# RELAX-NEXT: 0x10 R_LARCH_PCALA_HI20 .L1 0x0
+# RELAX-NEXT: 0x10 R_LARCH_RELAX - 0x0
# RELAX-NEXT: 0x14 R_LARCH_PCALA_LO12 .L1 0x0
+# RELAX-NEXT: 0x14 R_LARCH_RELAX - 0x0
# RELAX-NEXT: }
# RELAX-NEXT: Section ({{.*}}) .rela.data {
# RELAX-NEXT: 0xF R_LARCH_ADD8 .L3 0x0
@@ -29,13 +31,21 @@
# RELAX-NEXT: 0x12 R_LARCH_SUB32 .L2 0x0
# RELAX-NEXT: 0x16 R_LARCH_ADD64 .L3 0x0
# RELAX-NEXT: 0x16 R_LARCH_SUB64 .L2 0x0
+# RELAX-NEXT: 0x1E R_LARCH_ADD8 .L4 0x0
+# RELAX-NEXT: 0x1E R_LARCH_SUB8 .L3 0x0
+# RELAX-NEXT: 0x1F R_LARCH_ADD16 .L4 0x0
+# RELAX-NEXT: 0x1F R_LARCH_SUB16 .L3 0x0
+# RELAX-NEXT: 0x21 R_LARCH_ADD32 .L4 0x0
+# RELAX-NEXT: 0x21 R_LARCH_SUB32 .L3 0x0
+# RELAX-NEXT: 0x25 R_LARCH_ADD64 .L4 0x0
+# RELAX-NEXT: 0x25 R_LARCH_SUB64 .L3 0x0
# RELAX-NEXT: }
# RELAX-NEXT: ]
# RELAX: Hex dump of section '.data':
# RELAX-NEXT: 0x00000000 04040004 00000004 00000000 00000000
-# RELAX-NEXT: 0x00000010 00000000 00000000 00000000 00000808
-# RELAX-NEXT: 0x00000020 00080000 00080000 00000000 00
+# RELAX-NEXT: 0x00000010 00000000 00000000 00000000 00000000
+# RELAX-NEXT: 0x00000020 00000000 00000000 00000000 00
.text
.L1:
@@ -60,8 +70,6 @@
.short .L3 - .L2
.word .L3 - .L2
.dword .L3 - .L2
-## TODO
-## With relaxation, emit relocs because la.pcrel is a linker-relaxable inst.
.byte .L4 - .L3
.short .L4 - .L3
.word .L4 - .L3
--
2.20.1

@ -0,0 +1,123 @@
From be6e5c566f49bee5efe3d710bdd321e15d8d95ea Mon Sep 17 00:00:00 2001
From: Jinyang He <hejinyang@loongson.cn>
Date: Thu, 14 Mar 2024 12:10:50 +0800
Subject: [PATCH 05/14] [MC][LoongArch] Add AlignFragment size if layout is
available and not need insert nops (#76552)
Because the decision on ADD/SUB relocations is delayed, RISCV and LoongArch
may take the slow fragment-walk path with an available layout. When RISCV (or
LoongArch in the future) does not need to insert nops, relaxation is
disabled. With an available layout and no nops to insert, the size of the
AlignFragment is a constant, so we can add it to Displacement when
folding A-B.
(cherry picked from commit 0731567a31e4ade97c27801045156a88c4589704)
Change-Id: I554d6766bd7f688204e956e4a6431574b4c511c9
---
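Note: the reasoning above relies on the padding of an alignment fragment being fixed once the layout is known. A minimal sketch of that arithmetic follows; `alignPadding` is a hypothetical helper written for this note, not an LLVM function, and it assumes the fragment's start offset and the alignment in bytes are both known.

```cpp
#include <cassert>
#include <cstdint>

// Number of padding bytes needed to advance Offset to the next multiple of
// Alignment (in bytes). Returns 0 when Offset is already aligned.
uint64_t alignPadding(uint64_t Offset, uint64_t Alignment) {
  assert(Alignment != 0 && "alignment must be non-zero");
  return (Alignment - Offset % Alignment) % Alignment;
}
```

For instance, a fragment starting at offset 4 with a 16-byte alignment needs 12 bytes of padding, which is consistent with the 0x0c values in the updated relax-addsub.s hex dump. With an 8-byte alignment at offset 4 the padding is 4 bytes, which together with the following 4-byte nop matches the second DW_CFA_advance_loc of 8 in cfi-advance.s.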
llvm/lib/MC/MCExpr.cpp | 6 +++++
llvm/test/MC/LoongArch/Misc/cfi-advance.s | 27 +++++++++++++++++++
.../MC/LoongArch/Relocations/relax-addsub.s | 17 +++---------
3 files changed, 37 insertions(+), 13 deletions(-)
create mode 100644 llvm/test/MC/LoongArch/Misc/cfi-advance.s
diff --git a/llvm/lib/MC/MCExpr.cpp b/llvm/lib/MC/MCExpr.cpp
index 5a6596f93824..a561fed11179 100644
--- a/llvm/lib/MC/MCExpr.cpp
+++ b/llvm/lib/MC/MCExpr.cpp
@@ -707,8 +707,14 @@ static void AttemptToFoldSymbolOffsetDifference(
}
int64_t Num;
+ unsigned Count;
if (DF) {
Displacement += DF->getContents().size();
+ } else if (auto *AF = dyn_cast<MCAlignFragment>(FI);
+ AF && Layout &&
+ !Asm->getBackend().shouldInsertExtraNopBytesForCodeAlign(
+ *AF, Count)) {
+ Displacement += Asm->computeFragmentSize(*Layout, *AF);
} else if (auto *FF = dyn_cast<MCFillFragment>(FI);
FF && FF->getNumValues().evaluateAsAbsolute(Num)) {
Displacement += Num * FF->getValueSize();
diff --git a/llvm/test/MC/LoongArch/Misc/cfi-advance.s b/llvm/test/MC/LoongArch/Misc/cfi-advance.s
new file mode 100644
index 000000000000..662c43e6bcea
--- /dev/null
+++ b/llvm/test/MC/LoongArch/Misc/cfi-advance.s
@@ -0,0 +1,27 @@
+# RUN: llvm-mc --filetype=obj --triple=loongarch64 -mattr=-relax %s -o %t.o
+# RUN: llvm-readobj -r %t.o | FileCheck --check-prefix=RELOC %s
+# RUN: llvm-dwarfdump --debug-frame %t.o | FileCheck --check-prefix=DWARFDUMP %s
+
+# RELOC: Relocations [
+# RELOC-NEXT: .rela.eh_frame {
+# RELOC-NEXT: 0x1C R_LARCH_32_PCREL .text 0x0
+# RELOC-NEXT: }
+# RELOC-NEXT: ]
+# DWARFDUMP: DW_CFA_advance_loc: 4
+# DWARFDUMP-NEXT: DW_CFA_def_cfa_offset: +8
+# DWARFDUMP-NEXT: DW_CFA_advance_loc: 8
+# DWARFDUMP-NEXT: DW_CFA_def_cfa_offset: +8
+
+ .text
+ .globl test
+ .p2align 2
+ .type test,@function
+test:
+ .cfi_startproc
+ nop
+ .cfi_def_cfa_offset 8
+ .p2align 3
+ nop
+ .cfi_def_cfa_offset 8
+ nop
+ .cfi_endproc
diff --git a/llvm/test/MC/LoongArch/Relocations/relax-addsub.s b/llvm/test/MC/LoongArch/Relocations/relax-addsub.s
index c4454f5bb98d..14922657ae89 100644
--- a/llvm/test/MC/LoongArch/Relocations/relax-addsub.s
+++ b/llvm/test/MC/LoongArch/Relocations/relax-addsub.s
@@ -23,14 +23,6 @@
# RELAX-NEXT: 0x14 R_LARCH_RELAX - 0x0
# RELAX-NEXT: }
# RELAX-NEXT: Section ({{.*}}) .rela.data {
-# RELAX-NEXT: 0xF R_LARCH_ADD8 .L3 0x0
-# RELAX-NEXT: 0xF R_LARCH_SUB8 .L2 0x0
-# RELAX-NEXT: 0x10 R_LARCH_ADD16 .L3 0x0
-# RELAX-NEXT: 0x10 R_LARCH_SUB16 .L2 0x0
-# RELAX-NEXT: 0x12 R_LARCH_ADD32 .L3 0x0
-# RELAX-NEXT: 0x12 R_LARCH_SUB32 .L2 0x0
-# RELAX-NEXT: 0x16 R_LARCH_ADD64 .L3 0x0
-# RELAX-NEXT: 0x16 R_LARCH_SUB64 .L2 0x0
# RELAX-NEXT: 0x1E R_LARCH_ADD8 .L4 0x0
# RELAX-NEXT: 0x1E R_LARCH_SUB8 .L3 0x0
# RELAX-NEXT: 0x1F R_LARCH_ADD16 .L4 0x0
@@ -43,8 +35,8 @@
# RELAX-NEXT: ]
# RELAX: Hex dump of section '.data':
-# RELAX-NEXT: 0x00000000 04040004 00000004 00000000 00000000
-# RELAX-NEXT: 0x00000010 00000000 00000000 00000000 00000000
+# RELAX-NEXT: 0x00000000 04040004 00000004 00000000 0000000c
+# RELAX-NEXT: 0x00000010 0c000c00 00000c00 00000000 00000000
# RELAX-NEXT: 0x00000020 00000000 00000000 00000000 00
.text
@@ -63,13 +55,12 @@
.short .L2 - .L1
.word .L2 - .L1
.dword .L2 - .L1
-## With relaxation, emit relocs because of the .align making the diff variable.
-## TODO Handle alignment directive. Why they emit relocs now? They returns
-## without folding symbols offset in AttemptToFoldSymbolOffsetDifference().
+## TODO Handle alignment directive.
.byte .L3 - .L2
.short .L3 - .L2
.word .L3 - .L2
.dword .L3 - .L2
+## With relaxation, emit relocs because the la.pcrel makes the diff variable.
.byte .L4 - .L3
.short .L4 - .L3
.word .L4 - .L3
--
2.20.1

@ -0,0 +1,633 @@
From 8d7b71890179d32474b3a1a1c627481bd5a2327d Mon Sep 17 00:00:00 2001
From: zhanglimin <zhanglimin@loongson.cn>
Date: Fri, 15 Mar 2024 14:39:48 +0800
Subject: [PATCH 06/14] [LoongArch][RISCV] Support
R_LARCH_{ADD,SUB}_ULEB128/R_RISCV_{SET,SUB}_ULEB128 for .uleb128 directives
This patch is derived from three upstream commits:
1, R_LARCH_{ADD,SUB}_ULEB128 originally landed in b57159cb (#76433).
2, R_RISCV_{SET,SUB}_ULEB128 were originally supported in 1df5ea29. In this
backport we change the default of `-riscv-uleb128-reloc` so that no uleb128
relocs are produced, to avoid side effects from the updated implementation of
`MCAssembler::relaxLEB()`. This also ensures that the patch does not introduce
new default behaviour (such as generating uleb128 relocs) on RISCV in this version.
3, Fix invalid-sleb.s in original commit d7398a35.
Change-Id: Ie687b7d8483c76cf647141162641db1a9d819a04
---
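Note: the interaction between the zero-padded LEB128 placeholder and the later fixup can be sketched in isolation. The code below is an illustration under stated assumptions, not the code in this patch: `encodePaddedULEB128` and `patchULEB128` are hypothetical names, and the sketch assumes only the standard ULEB128 encoding (7 value bits per byte, high bit set as continuation flag).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Encode Value as ULEB128, padding with continuation bytes so that the
// result is at least PadTo bytes long.
std::vector<uint8_t> encodePaddedULEB128(uint64_t Value, unsigned PadTo) {
  std::vector<uint8_t> Out;
  do {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0 || Out.size() + 1 < PadTo)
      Byte |= 0x80; // more bytes follow
    Out.push_back(Byte);
  } while (Value != 0);
  // Fill the remaining width: 0x80 for inner bytes, 0x00 for the last one.
  while (Out.size() < PadTo)
    Out.push_back(Out.size() + 1 < PadTo ? 0x80 : 0x00);
  return Out;
}

// OR the 7-bit groups of Value into an already sized placeholder.
// Returns false if Value does not fit in the available bytes.
bool patchULEB128(std::vector<uint8_t> &Data, uint64_t Value) {
  for (std::size_t I = 0; I != Data.size() && Value; ++I, Value >>= 7)
    Data[I] |= uint8_t(Value & 0x7f);
  return Value == 0;
}
```

The point of the sketch is that a placeholder of zero encoded at the final width can be patched in place without changing its size: patching a two-byte zero placeholder with 300 gives the same bytes as encoding 300 directly. A value too large for the reserved width is reported as a failure rather than silently truncated.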
.../llvm/BinaryFormat/ELFRelocs/RISCV.def | 2 +
llvm/include/llvm/MC/MCAsmBackend.h | 8 +++
llvm/include/llvm/MC/MCFixup.h | 1 +
llvm/include/llvm/MC/MCFragment.h | 9 ++-
llvm/lib/MC/MCAsmBackend.cpp | 1 +
llvm/lib/MC/MCAssembler.cpp | 39 ++++++++--
.../MCTargetDesc/LoongArchAsmBackend.cpp | 69 ++++++++++++++----
.../MCTargetDesc/LoongArchAsmBackend.h | 3 +
.../RISCV/MCTargetDesc/RISCVAsmBackend.cpp | 27 +++++++
.../RISCV/MCTargetDesc/RISCVAsmBackend.h | 2 +
llvm/test/MC/ELF/RISCV/gen-dwarf.s | 5 +-
llvm/test/MC/LoongArch/Relocations/leb128.s | 72 +++++++++++++++++++
.../MC/LoongArch/Relocations/relax-addsub.s | 57 +++++++++++----
llvm/test/MC/X86/invalid-sleb.s | 5 --
14 files changed, 252 insertions(+), 48 deletions(-)
create mode 100644 llvm/test/MC/LoongArch/Relocations/leb128.s
delete mode 100644 llvm/test/MC/X86/invalid-sleb.s
diff --git a/llvm/include/llvm/BinaryFormat/ELFRelocs/RISCV.def b/llvm/include/llvm/BinaryFormat/ELFRelocs/RISCV.def
index 9a126df01531..c7fd6490041c 100644
--- a/llvm/include/llvm/BinaryFormat/ELFRelocs/RISCV.def
+++ b/llvm/include/llvm/BinaryFormat/ELFRelocs/RISCV.def
@@ -55,3 +55,5 @@ ELF_RELOC(R_RISCV_SET32, 56)
ELF_RELOC(R_RISCV_32_PCREL, 57)
ELF_RELOC(R_RISCV_IRELATIVE, 58)
ELF_RELOC(R_RISCV_PLT32, 59)
+ELF_RELOC(R_RISCV_SET_ULEB128, 60)
+ELF_RELOC(R_RISCV_SUB_ULEB128, 61)
diff --git a/llvm/include/llvm/MC/MCAsmBackend.h b/llvm/include/llvm/MC/MCAsmBackend.h
index 5e08fb41679b..968a767b17f8 100644
--- a/llvm/include/llvm/MC/MCAsmBackend.h
+++ b/llvm/include/llvm/MC/MCAsmBackend.h
@@ -21,6 +21,7 @@ class MCAlignFragment;
class MCDwarfCallFrameFragment;
class MCDwarfLineAddrFragment;
class MCFragment;
+class MCLEBFragment;
class MCRelaxableFragment;
class MCSymbol;
class MCAsmLayout;
@@ -194,6 +195,13 @@ public:
return false;
}
+ // Defined by linker relaxation targets to possibly emit LEB128 relocations
+ // and set Value at the relocated location.
+ virtual std::pair<bool, bool>
+ relaxLEB128(MCLEBFragment &LF, MCAsmLayout &Layout, int64_t &Value) const {
+ return std::make_pair(false, false);
+ }
+
/// @}
/// Returns the minimum size of a nop in bytes on this target. The assembler
diff --git a/llvm/include/llvm/MC/MCFixup.h b/llvm/include/llvm/MC/MCFixup.h
index 069ca058310f..7f48a90cb1ec 100644
--- a/llvm/include/llvm/MC/MCFixup.h
+++ b/llvm/include/llvm/MC/MCFixup.h
@@ -25,6 +25,7 @@ enum MCFixupKind {
FK_Data_4, ///< A four-byte fixup.
FK_Data_8, ///< A eight-byte fixup.
FK_Data_6b, ///< A six-bits fixup.
+ FK_Data_leb128, ///< A leb128 fixup.
FK_PCRel_1, ///< A one-byte pc relative fixup.
FK_PCRel_2, ///< A two-byte pc relative fixup.
FK_PCRel_4, ///< A four-byte pc relative fixup.
diff --git a/llvm/include/llvm/MC/MCFragment.h b/llvm/include/llvm/MC/MCFragment.h
index 7be4792a4521..e965732010fe 100644
--- a/llvm/include/llvm/MC/MCFragment.h
+++ b/llvm/include/llvm/MC/MCFragment.h
@@ -428,7 +428,7 @@ public:
}
};
-class MCLEBFragment : public MCFragment {
+class MCLEBFragment final : public MCEncodedFragmentWithFixups<10, 1> {
/// True if this is a sleb128, false if uleb128.
bool IsSigned;
@@ -439,17 +439,16 @@ class MCLEBFragment : public MCFragment {
public:
MCLEBFragment(const MCExpr &Value_, bool IsSigned_, MCSection *Sec = nullptr)
- : MCFragment(FT_LEB, false, Sec), IsSigned(IsSigned_), Value(&Value_) {
+ : MCEncodedFragmentWithFixups<10, 1>(FT_LEB, false, Sec),
+ IsSigned(IsSigned_), Value(&Value_) {
Contents.push_back(0);
}
const MCExpr &getValue() const { return *Value; }
+ void setValue(const MCExpr *Expr) { Value = Expr; }
bool isSigned() const { return IsSigned; }
- SmallString<8> &getContents() { return Contents; }
- const SmallString<8> &getContents() const { return Contents; }
-
/// @}
static bool classof(const MCFragment *F) {
diff --git a/llvm/lib/MC/MCAsmBackend.cpp b/llvm/lib/MC/MCAsmBackend.cpp
index 64bbc63719c7..2eef7d363fe7 100644
--- a/llvm/lib/MC/MCAsmBackend.cpp
+++ b/llvm/lib/MC/MCAsmBackend.cpp
@@ -89,6 +89,7 @@ const MCFixupKindInfo &MCAsmBackend::getFixupKindInfo(MCFixupKind Kind) const {
{"FK_Data_4", 0, 32, 0},
{"FK_Data_8", 0, 64, 0},
{"FK_Data_6b", 0, 6, 0},
+ {"FK_Data_leb128", 0, 0, 0},
{"FK_PCRel_1", 0, 8, MCFixupKindInfo::FKF_IsPCRel},
{"FK_PCRel_2", 0, 16, MCFixupKindInfo::FKF_IsPCRel},
{"FK_PCRel_4", 0, 32, MCFixupKindInfo::FKF_IsPCRel},
diff --git a/llvm/lib/MC/MCAssembler.cpp b/llvm/lib/MC/MCAssembler.cpp
index 55ed1a285cd7..86c798ec9e27 100644
--- a/llvm/lib/MC/MCAssembler.cpp
+++ b/llvm/lib/MC/MCAssembler.cpp
@@ -918,6 +918,12 @@ void MCAssembler::layout(MCAsmLayout &Layout) {
Contents = DF.getContents();
break;
}
+ case MCFragment::FT_LEB: {
+ auto &LF = cast<MCLEBFragment>(Frag);
+ Fixups = LF.getFixups();
+ Contents = LF.getContents();
+ break;
+ }
case MCFragment::FT_PseudoProbe: {
MCPseudoProbeAddrFragment &PF = cast<MCPseudoProbeAddrFragment>(Frag);
Fixups = PF.getFixups();
@@ -1006,12 +1012,31 @@ bool MCAssembler::relaxInstruction(MCAsmLayout &Layout,
}
bool MCAssembler::relaxLEB(MCAsmLayout &Layout, MCLEBFragment &LF) {
- uint64_t OldSize = LF.getContents().size();
+ const unsigned OldSize = static_cast<unsigned>(LF.getContents().size());
+ unsigned PadTo = OldSize;
int64_t Value;
- bool Abs = LF.getValue().evaluateKnownAbsolute(Value, Layout);
- if (!Abs)
- report_fatal_error("sleb128 and uleb128 expressions must be absolute");
- SmallString<8> &Data = LF.getContents();
+ SmallVectorImpl<char> &Data = LF.getContents();
+ LF.getFixups().clear();
+ // Use evaluateKnownAbsolute for Mach-O as a hack: .subsections_via_symbols
+ // requires that .uleb128 A-B is foldable where A and B reside in different
+ // fragments. This is used by __gcc_except_table.
+ bool Abs = getSubsectionsViaSymbols()
+ ? LF.getValue().evaluateKnownAbsolute(Value, Layout)
+ : LF.getValue().evaluateAsAbsolute(Value, Layout);
+ if (!Abs) {
+ bool Relaxed, UseZeroPad;
+ std::tie(Relaxed, UseZeroPad) = getBackend().relaxLEB128(LF, Layout, Value);
+ if (!Relaxed) {
+ getContext().reportError(LF.getValue().getLoc(),
+ Twine(LF.isSigned() ? ".s" : ".u") +
+ "leb128 expression is not absolute");
+ LF.setValue(MCConstantExpr::create(0, Context));
+ }
+ uint8_t Tmp[10]; // maximum size: ceil(64/7)
+ PadTo = std::max(PadTo, encodeULEB128(uint64_t(Value), Tmp));
+ if (UseZeroPad)
+ Value = 0;
+ }
Data.clear();
raw_svector_ostream OSE(Data);
// The compiler can generate EH table assembly that is impossible to assemble
@@ -1019,9 +1044,9 @@ bool MCAssembler::relaxLEB(MCAsmLayout &Layout, MCLEBFragment &LF) {
// to a later alignment fragment. To accommodate such tables, relaxation can
// only increase an LEB fragment size here, not decrease it. See PR35809.
if (LF.isSigned())
- encodeSLEB128(Value, OSE, OldSize);
+ encodeSLEB128(Value, OSE, PadTo);
else
- encodeULEB128(Value, OSE, OldSize);
+ encodeULEB128(Value, OSE, PadTo);
return OldSize != LF.getContents().size();
}
diff --git a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.cpp b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.cpp
index 1ed047a8e632..9227d4d6afed 100644
--- a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.cpp
+++ b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.cpp
@@ -92,6 +92,7 @@ static uint64_t adjustFixupValue(const MCFixup &Fixup, uint64_t Value,
case FK_Data_2:
case FK_Data_4:
case FK_Data_8:
+ case FK_Data_leb128:
return Value;
case LoongArch::fixup_loongarch_b16: {
if (!isInt<18>(Value))
@@ -129,6 +130,15 @@ static uint64_t adjustFixupValue(const MCFixup &Fixup, uint64_t Value,
}
}
+static void fixupLeb128(MCContext &Ctx, const MCFixup &Fixup,
+ MutableArrayRef<char> Data, uint64_t Value) {
+ unsigned I;
+ for (I = 0; I != Data.size() && Value; ++I, Value >>= 7)
+ Data[I] |= uint8_t(Value & 0x7f);
+ if (Value)
+ Ctx.reportError(Fixup.getLoc(), "Invalid uleb128 value!");
+}
+
void LoongArchAsmBackend::applyFixup(const MCAssembler &Asm,
const MCFixup &Fixup,
const MCValue &Target,
@@ -144,6 +154,10 @@ void LoongArchAsmBackend::applyFixup(const MCAssembler &Asm,
MCFixupKindInfo Info = getFixupKindInfo(Kind);
MCContext &Ctx = Asm.getContext();
+ // Fixup leb128 separately.
+ if (Fixup.getTargetKind() == FK_Data_leb128)
+ return fixupLeb128(Ctx, Fixup, Data, Value);
+
// Apply any target-specific value adjustments.
Value = adjustFixupValue(Fixup, Value, Ctx);
@@ -173,6 +187,7 @@ bool LoongArchAsmBackend::shouldForceRelocation(const MCAssembler &Asm,
case FK_Data_2:
case FK_Data_4:
case FK_Data_8:
+ case FK_Data_leb128:
return !Target.isAbsolute();
}
}
@@ -202,9 +217,24 @@ getRelocPairForSize(unsigned Size) {
return std::make_pair(
MCFixupKind(FirstLiteralRelocationKind + ELF::R_LARCH_ADD64),
MCFixupKind(FirstLiteralRelocationKind + ELF::R_LARCH_SUB64));
+ case 128:
+ return std::make_pair(
+ MCFixupKind(FirstLiteralRelocationKind + ELF::R_LARCH_ADD_ULEB128),
+ MCFixupKind(FirstLiteralRelocationKind + ELF::R_LARCH_SUB_ULEB128));
}
}
+std::pair<bool, bool> LoongArchAsmBackend::relaxLEB128(MCLEBFragment &LF,
+ MCAsmLayout &Layout,
+ int64_t &Value) const {
+ const MCExpr &Expr = LF.getValue();
+ if (LF.isSigned() || !Expr.evaluateKnownAbsolute(Value, Layout))
+ return std::make_pair(false, false);
+ LF.getFixups().push_back(
+ MCFixup::create(0, &Expr, FK_Data_leb128, Expr.getLoc()));
+ return std::make_pair(true, true);
+}
+
bool LoongArchAsmBackend::writeNopData(raw_ostream &OS, uint64_t Count,
const MCSubtargetInfo *STI) const {
// We mostly follow binutils' convention here: align to 4-byte boundary with a
@@ -226,21 +256,27 @@ bool LoongArchAsmBackend::handleAddSubRelocations(const MCAsmLayout &Layout,
uint64_t &FixedValue) const {
std::pair<MCFixupKind, MCFixupKind> FK;
uint64_t FixedValueA, FixedValueB;
- const MCSection &SecA = Target.getSymA()->getSymbol().getSection();
- const MCSection &SecB = Target.getSymB()->getSymbol().getSection();
-
- // We need record relocation if SecA != SecB. Usually SecB is same as the
- // section of Fixup, which will be record the relocation as PCRel. If SecB
- // is not same as the section of Fixup, it will report error. Just return
- // false and then this work can be finished by handleFixup.
- if (&SecA != &SecB)
- return false;
-
- // In SecA == SecB case. If the linker relaxation is enabled, we need record
- // the ADD, SUB relocations. Otherwise the FixedValue has already been
- // calculated out in evaluateFixup, return true and avoid record relocations.
- if (!STI.hasFeature(LoongArch::FeatureRelax))
- return true;
+ const MCSymbol &SA = Target.getSymA()->getSymbol();
+ const MCSymbol &SB = Target.getSymB()->getSymbol();
+
+ bool force = !SA.isInSection() || !SB.isInSection();
+ if (!force) {
+ const MCSection &SecA = SA.getSection();
+ const MCSection &SecB = SB.getSection();
+
+ // We need to record a relocation if SecA != SecB. Usually SecB is the same
+ // as the section of Fixup, and the relocation is recorded as PCRel. If SecB
+ // is not the same as the section of Fixup, an error is reported. Just
+ // return false and let handleFixup finish this work.
+ if (&SecA != &SecB)
+ return false;
+
+ // In the SecA == SecB case, if linker relaxation is enabled, we need to
+ // record the ADD, SUB relocations. Otherwise the FixedValue has already been
+ // calculated in evaluateFixup; return true and avoid recording relocations.
+ if (!STI.hasFeature(LoongArch::FeatureRelax))
+ return true;
+ }
switch (Fixup.getKind()) {
case llvm::FK_Data_1:
@@ -255,6 +291,9 @@ bool LoongArchAsmBackend::handleAddSubRelocations(const MCAsmLayout &Layout,
case llvm::FK_Data_8:
FK = getRelocPairForSize(64);
break;
+ case llvm::FK_Data_leb128:
+ FK = getRelocPairForSize(128);
+ break;
default:
llvm_unreachable("unsupported fixup size");
}
diff --git a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.h b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.h
index 20f25b5cf53b..49801e4fd81a 100644
--- a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.h
+++ b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.h
@@ -65,6 +65,9 @@ public:
void relaxInstruction(MCInst &Inst,
const MCSubtargetInfo &STI) const override {}
+ std::pair<bool, bool> relaxLEB128(MCLEBFragment &LF, MCAsmLayout &Layout,
+ int64_t &Value) const override;
+
bool writeNopData(raw_ostream &OS, uint64_t Count,
const MCSubtargetInfo *STI) const override;
diff --git a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.cpp b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.cpp
index 1b890fbe041a..5c651aa93225 100644
--- a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.cpp
+++ b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.cpp
@@ -19,6 +19,7 @@
#include "llvm/MC/MCObjectWriter.h"
#include "llvm/MC/MCSymbol.h"
#include "llvm/MC/MCValue.h"
+#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Endian.h"
#include "llvm/Support/EndianStream.h"
#include "llvm/Support/ErrorHandling.h"
@@ -27,6 +28,13 @@
using namespace llvm;
+// Temporary workaround for old linkers that do not support ULEB128 relocations,
+// which are abused by DWARF v5 DW_LLE_offset_pair/DW_RLE_offset_pair
+// implemented in Clang/LLVM.
+static cl::opt<bool> ULEB128Reloc(
+ "riscv-uleb128-reloc", cl::init(false), cl::Hidden,
+ cl::desc("Emit R_RISCV_SET_ULEB128/R_RISCV_SUB_ULEB128 if appropriate"));
+
std::optional<MCFixupKind> RISCVAsmBackend::getFixupKind(StringRef Name) const {
if (STI.getTargetTriple().isOSBinFormatELF()) {
unsigned Type;
@@ -126,6 +134,7 @@ bool RISCVAsmBackend::shouldForceRelocation(const MCAssembler &Asm,
case FK_Data_2:
case FK_Data_4:
case FK_Data_8:
+ case FK_Data_leb128:
if (Target.isAbsolute())
return false;
break;
@@ -330,6 +339,19 @@ bool RISCVAsmBackend::relaxDwarfCFA(MCDwarfCallFrameFragment &DF,
return true;
}
+std::pair<bool, bool> RISCVAsmBackend::relaxLEB128(MCLEBFragment &LF,
+ MCAsmLayout &Layout,
+ int64_t &Value) const {
+ if (LF.isSigned())
+ return std::make_pair(false, false);
+ const MCExpr &Expr = LF.getValue();
+ if (ULEB128Reloc) {
+ LF.getFixups().push_back(
+ MCFixup::create(0, &Expr, FK_Data_leb128, Expr.getLoc()));
+ }
+ return std::make_pair(Expr.evaluateKnownAbsolute(Value, Layout), false);
+}
+
// Given a compressed control flow instruction this function returns
// the expanded instruction.
unsigned RISCVAsmBackend::getRelaxedOpcode(unsigned Op) const {
@@ -416,6 +438,7 @@ static uint64_t adjustFixupValue(const MCFixup &Fixup, uint64_t Value,
case FK_Data_4:
case FK_Data_8:
case FK_Data_6b:
+ case FK_Data_leb128:
return Value;
case RISCV::fixup_riscv_set_6b:
return Value & 0x03;
@@ -596,6 +619,10 @@ bool RISCVAsmBackend::handleAddSubRelocations(const MCAsmLayout &Layout,
TA = ELF::R_RISCV_ADD64;
TB = ELF::R_RISCV_SUB64;
break;
+ case llvm::FK_Data_leb128:
+ TA = ELF::R_RISCV_SET_ULEB128;
+ TB = ELF::R_RISCV_SUB_ULEB128;
+ break;
default:
llvm_unreachable("unsupported fixup size");
}
diff --git a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.h b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.h
index 0ea1f32e8296..edefb171bcdc 100644
--- a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.h
+++ b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.h
@@ -99,6 +99,8 @@ public:
bool &WasRelaxed) const override;
bool relaxDwarfCFA(MCDwarfCallFrameFragment &DF, MCAsmLayout &Layout,
bool &WasRelaxed) const override;
+ std::pair<bool, bool> relaxLEB128(MCLEBFragment &LF, MCAsmLayout &Layout,
+ int64_t &Value) const override;
bool writeNopData(raw_ostream &OS, uint64_t Count,
const MCSubtargetInfo *STI) const override;
diff --git a/llvm/test/MC/ELF/RISCV/gen-dwarf.s b/llvm/test/MC/ELF/RISCV/gen-dwarf.s
index 2235559d5f35..2a7dc777e70c 100644
--- a/llvm/test/MC/ELF/RISCV/gen-dwarf.s
+++ b/llvm/test/MC/ELF/RISCV/gen-dwarf.s
@@ -9,7 +9,7 @@
## emit special opcodes to make .debug_line smaller, but we don't do this for
## consistency.
-# RUN: llvm-mc -filetype=obj -triple=riscv64 -g -dwarf-version=5 -mattr=+relax < %s -o %t
+# RUN: llvm-mc -filetype=obj -triple=riscv64 -g -dwarf-version=5 -mattr=+relax -riscv-uleb128-reloc=1 < %s -o %t
# RUN: llvm-dwarfdump -eh-frame -debug-line -debug-rnglists -v %t | FileCheck %s
# RUN: llvm-readobj -r -x .eh_frame %t | FileCheck %s --check-prefix=RELOC
@@ -48,9 +48,10 @@
# RELOC-NEXT: 0x34 R_RISCV_32_PCREL <null> 0x0
# RELOC-NEXT: }
-## TODO A section needs two relocations.
# RELOC: Section ([[#]]) .rela.debug_rnglists {
# RELOC-NEXT: 0xD R_RISCV_64 .text.foo 0x0
+# RELOC-NEXT: 0x15 R_RISCV_SET_ULEB128 <null> 0x0
+# RELOC-NEXT: 0x15 R_RISCV_SUB_ULEB128 .text.foo 0x0
# RELOC-NEXT: 0x17 R_RISCV_64 .text.bar 0x0
# RELOC-NEXT: }
diff --git a/llvm/test/MC/LoongArch/Relocations/leb128.s b/llvm/test/MC/LoongArch/Relocations/leb128.s
new file mode 100644
index 000000000000..7a96ec551b76
--- /dev/null
+++ b/llvm/test/MC/LoongArch/Relocations/leb128.s
@@ -0,0 +1,72 @@
+# RUN: llvm-mc --filetype=obj --triple=loongarch64 --mattr=-relax %s -o %t
+# RUN: llvm-readobj -r -x .alloc_w %t | FileCheck --check-prefixes=CHECK,NORELAX %s
+# RUN: llvm-mc --filetype=obj --triple=loongarch64 --mattr=+relax %s -o %t.relax
+# RUN: llvm-readobj -r -x .alloc_w %t.relax | FileCheck --check-prefixes=CHECK,RELAX %s
+
+# RUN: not llvm-mc --filetype=obj --triple=loongarch64 --mattr=-relax --defsym ERR=1 %s -o /dev/null 2>&1 | \
+# RUN: FileCheck %s --check-prefix=ERR
+# RUN: not llvm-mc --filetype=obj --triple=loongarch64 --mattr=+relax --defsym ERR=1 %s -o /dev/null 2>&1 | \
+# RUN: FileCheck %s --check-prefix=ERR
+
+# CHECK: Relocations [
+# CHECK-NEXT: .rela.alloc_w {
+# RELAX-NEXT: 0x0 R_LARCH_ADD_ULEB128 w1 0x0
+# RELAX-NEXT: 0x0 R_LARCH_SUB_ULEB128 w 0x0
+# RELAX-NEXT: 0x1 R_LARCH_ADD_ULEB128 w2 0x0
+# RELAX-NEXT: 0x1 R_LARCH_SUB_ULEB128 w1 0x0
+# CHECK-NEXT: 0x2 R_LARCH_PCALA_HI20 foo 0x0
+# RELAX-NEXT: 0x2 R_LARCH_RELAX - 0x0
+# CHECK-NEXT: 0x6 R_LARCH_PCALA_LO12 foo 0x0
+# RELAX-NEXT: 0x6 R_LARCH_RELAX - 0x0
+# RELAX-NEXT: 0xA R_LARCH_ADD_ULEB128 w2 0x0
+# RELAX-NEXT: 0xA R_LARCH_SUB_ULEB128 w1 0x0
+# RELAX-NEXT: 0xB R_LARCH_ADD_ULEB128 w2 0x78
+# RELAX-NEXT: 0xB R_LARCH_SUB_ULEB128 w1 0x0
+# RELAX-NEXT: 0xD R_LARCH_ADD_ULEB128 w1 0x0
+# RELAX-NEXT: 0xD R_LARCH_SUB_ULEB128 w2 0x0
+# RELAX-NEXT: 0x17 R_LARCH_ADD_ULEB128 w3 0x6F
+# RELAX-NEXT: 0x17 R_LARCH_SUB_ULEB128 w2 0x0
+# RELAX-NEXT: 0x18 R_LARCH_ADD_ULEB128 w3 0x71
+# RELAX-NEXT: 0x18 R_LARCH_SUB_ULEB128 w2 0x0
+# CHECK-NEXT: }
+# CHECK-NEXT: ]
+
+# CHECK: Hex dump of section '.alloc_w':
+# NORELAX-NEXT: 0x00000000 02080c00 001a8c01 c0020880 01f8ffff
+# NORELAX-NEXT: 0x00000010 ffffffff ffff017f 8101
+# RELAX-NEXT: 0x00000000 00000c00 001a8c01 c0020080 00808080
+# RELAX-NEXT: 0x00000010 80808080 80800000 8000
+
+.section .alloc_w,"ax",@progbits; w:
+.uleb128 w1-w # w1 is later defined in the same section
+.uleb128 w2-w1 # w1 and w2 are separated by a linker relaxable instruction
+w1:
+ la.pcrel $t0, foo
+w2:
+.uleb128 w2-w1 # 0x08
+.uleb128 w2-w1+120 # 0x0180
+.uleb128 -(w2-w1) # 0x01fffffffffffffffff8
+.uleb128 w3-w2+111 # 0x7f
+.uleb128 w3-w2+113 # 0x0181
+w3:
+
+.ifdef ERR
+# ERR: :[[#@LINE+1]]:16: error: .uleb128 expression is not absolute
+.uleb128 extern-w # extern is undefined
+# ERR: :[[#@LINE+1]]:11: error: .uleb128 expression is not absolute
+.uleb128 w-extern
+# ERR: :[[#@LINE+1]]:11: error: .uleb128 expression is not absolute
+.uleb128 x-w # x is later defined in another section
+
+.section .alloc_x,"aw",@progbits; x:
+# ERR: :[[#@LINE+1]]:11: error: .uleb128 expression is not absolute
+.uleb128 y-x
+.section .alloc_y,"aw",@progbits; y:
+# ERR: :[[#@LINE+1]]:11: error: .uleb128 expression is not absolute
+.uleb128 x-y
+
+# ERR: :[[#@LINE+1]]:10: error: .uleb128 expression is not absolute
+.uleb128 extern
+# ERR: :[[#@LINE+1]]:10: error: .uleb128 expression is not absolute
+.uleb128 y
+.endif
diff --git a/llvm/test/MC/LoongArch/Relocations/relax-addsub.s b/llvm/test/MC/LoongArch/Relocations/relax-addsub.s
index 14922657ae89..cd01332afd0b 100644
--- a/llvm/test/MC/LoongArch/Relocations/relax-addsub.s
+++ b/llvm/test/MC/LoongArch/Relocations/relax-addsub.s
@@ -8,12 +8,23 @@
# NORELAX-NEXT: 0x10 R_LARCH_PCALA_HI20 .text 0x0
# NORELAX-NEXT: 0x14 R_LARCH_PCALA_LO12 .text 0x0
# NORELAX-NEXT: }
+# NORELAX-NEXT: Section ({{.*}}) .rela.data {
+# NORELAX-NEXT: 0x30 R_LARCH_ADD8 foo 0x0
+# NORELAX-NEXT: 0x30 R_LARCH_SUB8 .text 0x10
+# NORELAX-NEXT: 0x31 R_LARCH_ADD16 foo 0x0
+# NORELAX-NEXT: 0x31 R_LARCH_SUB16 .text 0x10
+# NORELAX-NEXT: 0x33 R_LARCH_ADD32 foo 0x0
+# NORELAX-NEXT: 0x33 R_LARCH_SUB32 .text 0x10
+# NORELAX-NEXT: 0x37 R_LARCH_ADD64 foo 0x0
+# NORELAX-NEXT: 0x37 R_LARCH_SUB64 .text 0x10
+# NORELAX-NEXT: }
# NORELAX-NEXT: ]
# NORELAX: Hex dump of section '.data':
-# NORELAX-NEXT: 0x00000000 04040004 00000004 00000000 0000000c
-# NORELAX-NEXT: 0x00000010 0c000c00 00000c00 00000000 00000808
-# NORELAX-NEXT: 0x00000020 00080000 00080000 00000000 00
+# NORELAX-NEXT: 0x00000000 04040004 00000004 00000000 00000004
+# NORELAX-NEXT: 0x00000010 0c0c000c 0000000c 00000000 0000000c
+# NORELAX-NEXT: 0x00000020 08080008 00000008 00000000 00000008
+# NORELAX-NEXT: 0x00000030 00000000 00000000 00000000 000000
# RELAX: Relocations [
# RELAX-NEXT: Section ({{.*}}) .rela.text {
@@ -23,21 +34,32 @@
# RELAX-NEXT: 0x14 R_LARCH_RELAX - 0x0
# RELAX-NEXT: }
# RELAX-NEXT: Section ({{.*}}) .rela.data {
-# RELAX-NEXT: 0x1E R_LARCH_ADD8 .L4 0x0
-# RELAX-NEXT: 0x1E R_LARCH_SUB8 .L3 0x0
-# RELAX-NEXT: 0x1F R_LARCH_ADD16 .L4 0x0
-# RELAX-NEXT: 0x1F R_LARCH_SUB16 .L3 0x0
-# RELAX-NEXT: 0x21 R_LARCH_ADD32 .L4 0x0
-# RELAX-NEXT: 0x21 R_LARCH_SUB32 .L3 0x0
-# RELAX-NEXT: 0x25 R_LARCH_ADD64 .L4 0x0
-# RELAX-NEXT: 0x25 R_LARCH_SUB64 .L3 0x0
+# RELAX-NEXT: 0x20 R_LARCH_ADD8 .L4 0x0
+# RELAX-NEXT: 0x20 R_LARCH_SUB8 .L3 0x0
+# RELAX-NEXT: 0x21 R_LARCH_ADD16 .L4 0x0
+# RELAX-NEXT: 0x21 R_LARCH_SUB16 .L3 0x0
+# RELAX-NEXT: 0x23 R_LARCH_ADD32 .L4 0x0
+# RELAX-NEXT: 0x23 R_LARCH_SUB32 .L3 0x0
+# RELAX-NEXT: 0x27 R_LARCH_ADD64 .L4 0x0
+# RELAX-NEXT: 0x27 R_LARCH_SUB64 .L3 0x0
+# RELAX-NEXT: 0x2F R_LARCH_ADD_ULEB128 .L4 0x0
+# RELAX-NEXT: 0x2F R_LARCH_SUB_ULEB128 .L3 0x0
+# RELAX-NEXT: 0x30 R_LARCH_ADD8 foo 0x0
+# RELAX-NEXT: 0x30 R_LARCH_SUB8 .L3 0x0
+# RELAX-NEXT: 0x31 R_LARCH_ADD16 foo 0x0
+# RELAX-NEXT: 0x31 R_LARCH_SUB16 .L3 0x0
+# RELAX-NEXT: 0x33 R_LARCH_ADD32 foo 0x0
+# RELAX-NEXT: 0x33 R_LARCH_SUB32 .L3 0x0
+# RELAX-NEXT: 0x37 R_LARCH_ADD64 foo 0x0
+# RELAX-NEXT: 0x37 R_LARCH_SUB64 .L3 0x0
# RELAX-NEXT: }
# RELAX-NEXT: ]
# RELAX: Hex dump of section '.data':
-# RELAX-NEXT: 0x00000000 04040004 00000004 00000000 0000000c
-# RELAX-NEXT: 0x00000010 0c000c00 00000c00 00000000 00000000
-# RELAX-NEXT: 0x00000020 00000000 00000000 00000000 00
+# RELAX-NEXT: 0x00000000 04040004 00000004 00000000 00000004
+# RELAX-NEXT: 0x00000010 0c0c000c 0000000c 00000000 0000000c
+# RELAX-NEXT: 0x00000020 00000000 00000000 00000000 00000000
+# RELAX-NEXT: 0x00000030 00000000 00000000 00000000 000000
.text
.L1:
@@ -55,13 +77,20 @@
.short .L2 - .L1
.word .L2 - .L1
.dword .L2 - .L1
+.uleb128 .L2 - .L1
## TODO Handle alignment directive.
.byte .L3 - .L2
.short .L3 - .L2
.word .L3 - .L2
.dword .L3 - .L2
+.uleb128 .L3 - .L2
## With relaxation, emit relocs because the la.pcrel makes the diff variable.
.byte .L4 - .L3
.short .L4 - .L3
.word .L4 - .L3
.dword .L4 - .L3
+.uleb128 .L4 - .L3
+.byte foo - .L3
+.short foo - .L3
+.word foo - .L3
+.dword foo - .L3
diff --git a/llvm/test/MC/X86/invalid-sleb.s b/llvm/test/MC/X86/invalid-sleb.s
deleted file mode 100644
index 7d7df351ce4e..000000000000
--- a/llvm/test/MC/X86/invalid-sleb.s
+++ /dev/null
@@ -1,5 +0,0 @@
-// RUN: not --crash llvm-mc -filetype=obj -triple x86_64-pc-linux %s -o %t 2>&1 | FileCheck %s
-
-// CHECK: sleb128 and uleb128 expressions must be absolute
-
- .sleb128 undefined
--
2.20.1
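
The byte comments in leb128.s (`# 0x08`, `# 0x0180`, `# 0x01fffffffffffffffff8`, ...) follow directly from the ULEB128 encoding of each label difference. A minimal standalone sketch of that encoding is shown below; `encodeULEB128` here is a local illustration written for this note, not the LLVM helper from `llvm/Support/LEB128.h`, and it returns the bytes in emission order (lowest 7 bits first).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Illustrative ULEB128 encoder (hypothetical helper, not LLVM's API).
// Emits 7 payload bits per byte, low bits first; bit 7 marks "more bytes follow".
std::vector<uint8_t> encodeULEB128(uint64_t Value) {
  std::vector<uint8_t> Out;
  do {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80; // continuation bit
    Out.push_back(Byte);
  } while (Value != 0);
  return Out;
}
```

With `w2-w1 == 8` (the two instructions expanded from `la.pcrel`), the values 8, 128, `-8` taken as an unsigned 64-bit value, 127 and 129 encode to `08`, `80 01`, `f8 ff ff ff ff ff ff ff ff 01`, `7f` and `81 01`, which is the byte order seen in the NORELAX hex dump.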


@ -0,0 +1,376 @@
From 286c92a8e78c4b67368c2f47a8e73036fdacbae2 Mon Sep 17 00:00:00 2001
From: Jinyang He <hejinyang@loongson.cn>
Date: Tue, 16 Jan 2024 13:20:13 +0800
Subject: [PATCH 07/14] [LoongArch] Add relaxDwarfLineAddr and relaxDwarfCFA to
handle the mutable label diff in dwarfinfo (#77728)
When linker relaxation is enabled, some label differences in the DWARF info
cannot be computed before static linking. Following RISCV, we add
relaxDwarfLineAddr and relaxDwarfCFA to emit relocations for these label
differences. Each hook first checks whether the label difference is mutable;
for an immutable one it returns false and leaves the remaining work to the
caller.
(cherry picked from commit ed7f4edc19ada006789318a0929b57d1b5a761bd)
Change-Id: Iae5bad958c6d1a71dac1672f5f03991eaeea6d22
---
llvm/lib/Object/RelocationResolver.cpp | 12 +-
.../MCTargetDesc/LoongArchAsmBackend.cpp | 129 ++++++++++++++++++
.../MCTargetDesc/LoongArchAsmBackend.h | 5 +
.../LoongArch/dwarf-loongarch-relocs.ll | 128 +++++++++++++++++
llvm/test/DebugInfo/LoongArch/lit.local.cfg | 2 +
5 files changed, 274 insertions(+), 2 deletions(-)
create mode 100644 llvm/test/DebugInfo/LoongArch/dwarf-loongarch-relocs.ll
create mode 100644 llvm/test/DebugInfo/LoongArch/lit.local.cfg
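
As a reading aid for the relaxDwarfCFA hunk in this patch, the sketch below mirrors how the advance_loc form is chosen from the magnitude of the address delta. It is a standalone approximation: `AdvanceLocChoice` and `chooseAdvanceLoc` are hypothetical names introduced for this note, and the opcode bytes are the standard DWARF values for DW_CFA_advance_loc (0x40), DW_CFA_advance_loc1 (0x02), DW_CFA_advance_loc2 (0x03) and DW_CFA_advance_loc4 (0x04). The zero-delta case, where the patch returns before emitting anything, is excluded by a precondition.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of one relaxDwarfCFA decision.
struct AdvanceLocChoice {
  uint8_t Opcode;        // DW_CFA_advance_loc* opcode byte
  unsigned OperandBytes; // size of the separate delta operand (0 for the 6-bit form)
  unsigned FixupOffset;  // where the ADD/SUB fixup pair is attached
  unsigned RelocBits;    // width handed to getRelocPairForSize in the patch
};

// Picks the smallest encoding that can hold Value, as the patch does.
AdvanceLocChoice chooseAdvanceLoc(uint64_t Value) {
  assert(Value != 0 && Value <= UINT32_MAX && "delta must fit an advance_loc form");
  if (Value < (1u << 6))
    return {0x40, 0, 0, 6}; // delta lives in the low 6 bits of the opcode byte
  if (Value <= UINT8_MAX)
    return {0x02, 1, 1, 8};
  if (Value <= UINT16_MAX)
    return {0x03, 2, 1, 16};
  return {0x04, 4, 1, 32};
}
```

The 6-bit form is the reason the patch also teaches RelocationResolver about R_LARCH_ADD6 and R_LARCH_SUB6: the delta shares its byte with the opcode, so the fixup sits at offset 0 instead of offset 1.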
diff --git a/llvm/lib/Object/RelocationResolver.cpp b/llvm/lib/Object/RelocationResolver.cpp
index 03ac59289528..0e5036d7dfcc 100644
--- a/llvm/lib/Object/RelocationResolver.cpp
+++ b/llvm/lib/Object/RelocationResolver.cpp
@@ -539,6 +539,8 @@ static bool supportsLoongArch(uint64_t Type) {
case ELF::R_LARCH_32:
case ELF::R_LARCH_32_PCREL:
case ELF::R_LARCH_64:
+ case ELF::R_LARCH_ADD6:
+ case ELF::R_LARCH_SUB6:
case ELF::R_LARCH_ADD8:
case ELF::R_LARCH_SUB8:
case ELF::R_LARCH_ADD16:
@@ -564,6 +566,10 @@ static uint64_t resolveLoongArch(uint64_t Type, uint64_t Offset, uint64_t S,
return (S + Addend - Offset) & 0xFFFFFFFF;
case ELF::R_LARCH_64:
return S + Addend;
+ case ELF::R_LARCH_ADD6:
+ return (LocData & 0xC0) | ((LocData + S + Addend) & 0x3F);
+ case ELF::R_LARCH_SUB6:
+ return (LocData & 0xC0) | ((LocData - (S + Addend)) & 0x3F);
case ELF::R_LARCH_ADD8:
return (LocData + (S + Addend)) & 0xFF;
case ELF::R_LARCH_SUB8:
@@ -880,8 +886,10 @@ uint64_t resolveRelocation(RelocationResolver Resolver, const RelocationRef &R,
if (GetRelSectionType() == ELF::SHT_RELA) {
Addend = getELFAddend(R);
- // RISCV relocations use both LocData and Addend.
- if (Obj->getArch() != Triple::riscv32 &&
+ // LoongArch and RISCV relocations use both LocData and Addend.
+ if (Obj->getArch() != Triple::loongarch32 &&
+ Obj->getArch() != Triple::loongarch64 &&
+ Obj->getArch() != Triple::riscv32 &&
Obj->getArch() != Triple::riscv64)
LocData = 0;
}
diff --git a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.cpp b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.cpp
index 9227d4d6afed..8d82327b2e2b 100644
--- a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.cpp
+++ b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.cpp
@@ -12,6 +12,7 @@
#include "LoongArchAsmBackend.h"
#include "LoongArchFixupKinds.h"
+#include "llvm/MC/MCAsmInfo.h"
#include "llvm/MC/MCAsmLayout.h"
#include "llvm/MC/MCAssembler.h"
#include "llvm/MC/MCContext.h"
@@ -19,6 +20,7 @@
#include "llvm/MC/MCValue.h"
#include "llvm/Support/Endian.h"
#include "llvm/Support/EndianStream.h"
+#include "llvm/Support/LEB128.h"
#define DEBUG_TYPE "loongarch-asmbackend"
@@ -235,6 +237,133 @@ std::pair<bool, bool> LoongArchAsmBackend::relaxLEB128(MCLEBFragment &LF,
return std::make_pair(true, true);
}
+bool LoongArchAsmBackend::relaxDwarfLineAddr(MCDwarfLineAddrFragment &DF,
+ MCAsmLayout &Layout,
+ bool &WasRelaxed) const {
+ MCContext &C = Layout.getAssembler().getContext();
+
+ int64_t LineDelta = DF.getLineDelta();
+ const MCExpr &AddrDelta = DF.getAddrDelta();
+ SmallVectorImpl<char> &Data = DF.getContents();
+ SmallVectorImpl<MCFixup> &Fixups = DF.getFixups();
+ size_t OldSize = Data.size();
+
+ int64_t Value;
+ if (AddrDelta.evaluateAsAbsolute(Value, Layout))
+ return false;
+ bool IsAbsolute = AddrDelta.evaluateKnownAbsolute(Value, Layout);
+ assert(IsAbsolute && "Line address delta with invalid expression");
+ (void)IsAbsolute;
+
+ Data.clear();
+ Fixups.clear();
+ raw_svector_ostream OS(Data);
+
+ // INT64_MAX is a signal that this is actually a DW_LNE_end_sequence.
+ if (LineDelta != INT64_MAX) {
+ OS << uint8_t(dwarf::DW_LNS_advance_line);
+ encodeSLEB128(LineDelta, OS);
+ }
+
+ unsigned Offset;
+ std::pair<MCFixupKind, MCFixupKind> FK;
+
+ // According to the DWARF specification, the `DW_LNS_fixed_advance_pc` opcode
+ // takes a single unsigned half (unencoded) operand. The maximum encodable
+ // value is therefore 65535. Set a conservative upper bound for relaxation.
+ if (Value > 60000) {
+ unsigned PtrSize = C.getAsmInfo()->getCodePointerSize();
+
+ OS << uint8_t(dwarf::DW_LNS_extended_op);
+ encodeULEB128(PtrSize + 1, OS);
+
+ OS << uint8_t(dwarf::DW_LNE_set_address);
+ Offset = OS.tell();
+ assert((PtrSize == 4 || PtrSize == 8) && "Unexpected pointer size");
+ FK = getRelocPairForSize(PtrSize == 4 ? 32 : 64);
+ OS.write_zeros(PtrSize);
+ } else {
+ OS << uint8_t(dwarf::DW_LNS_fixed_advance_pc);
+ Offset = OS.tell();
+ FK = getRelocPairForSize(16);
+ support::endian::write<uint16_t>(OS, 0, support::little);
+ }
+
+ const MCBinaryExpr &MBE = cast<MCBinaryExpr>(AddrDelta);
+ Fixups.push_back(MCFixup::create(Offset, MBE.getLHS(), std::get<0>(FK)));
+ Fixups.push_back(MCFixup::create(Offset, MBE.getRHS(), std::get<1>(FK)));
+
+ if (LineDelta == INT64_MAX) {
+ OS << uint8_t(dwarf::DW_LNS_extended_op);
+ OS << uint8_t(1);
+ OS << uint8_t(dwarf::DW_LNE_end_sequence);
+ } else {
+ OS << uint8_t(dwarf::DW_LNS_copy);
+ }
+
+ WasRelaxed = OldSize != Data.size();
+ return true;
+}
+
+bool LoongArchAsmBackend::relaxDwarfCFA(MCDwarfCallFrameFragment &DF,
+ MCAsmLayout &Layout,
+ bool &WasRelaxed) const {
+ const MCExpr &AddrDelta = DF.getAddrDelta();
+ SmallVectorImpl<char> &Data = DF.getContents();
+ SmallVectorImpl<MCFixup> &Fixups = DF.getFixups();
+ size_t OldSize = Data.size();
+
+ int64_t Value;
+ if (AddrDelta.evaluateAsAbsolute(Value, Layout))
+ return false;
+ bool IsAbsolute = AddrDelta.evaluateKnownAbsolute(Value, Layout);
+ assert(IsAbsolute && "CFA with invalid expression");
+ (void)IsAbsolute;
+
+ Data.clear();
+ Fixups.clear();
+ raw_svector_ostream OS(Data);
+
+ assert(
+ Layout.getAssembler().getContext().getAsmInfo()->getMinInstAlignment() ==
+ 1 &&
+ "expected 1-byte alignment");
+ if (Value == 0) {
+ WasRelaxed = OldSize != Data.size();
+ return true;
+ }
+
+ auto AddFixups = [&Fixups,
+ &AddrDelta](unsigned Offset,
+ std::pair<MCFixupKind, MCFixupKind> FK) {
+ const MCBinaryExpr &MBE = cast<MCBinaryExpr>(AddrDelta);
+ Fixups.push_back(MCFixup::create(Offset, MBE.getLHS(), std::get<0>(FK)));
+ Fixups.push_back(MCFixup::create(Offset, MBE.getRHS(), std::get<1>(FK)));
+ };
+
+ if (isUIntN(6, Value)) {
+ OS << uint8_t(dwarf::DW_CFA_advance_loc);
+ AddFixups(0, getRelocPairForSize(6));
+ } else if (isUInt<8>(Value)) {
+ OS << uint8_t(dwarf::DW_CFA_advance_loc1);
+ support::endian::write<uint8_t>(OS, 0, support::little);
+ AddFixups(1, getRelocPairForSize(8));
+ } else if (isUInt<16>(Value)) {
+ OS << uint8_t(dwarf::DW_CFA_advance_loc2);
+ support::endian::write<uint16_t>(OS, 0, support::little);
+ AddFixups(1, getRelocPairForSize(16));
+ } else if (isUInt<32>(Value)) {
+ OS << uint8_t(dwarf::DW_CFA_advance_loc4);
+ support::endian::write<uint32_t>(OS, 0, support::little);
+ AddFixups(1, getRelocPairForSize(32));
+ } else {
+ llvm_unreachable("unsupported CFA encoding");
+ }
+
+ WasRelaxed = OldSize != Data.size();
+ return true;
+}
+
bool LoongArchAsmBackend::writeNopData(raw_ostream &OS, uint64_t Count,
const MCSubtargetInfo *STI) const {
// We mostly follow binutils' convention here: align to 4-byte boundary with a
diff --git a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.h b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.h
index 49801e4fd81a..657f5ca5e731 100644
--- a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.h
+++ b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.h
@@ -68,6 +68,11 @@ public:
std::pair<bool, bool> relaxLEB128(MCLEBFragment &LF, MCAsmLayout &Layout,
int64_t &Value) const override;
+ bool relaxDwarfLineAddr(MCDwarfLineAddrFragment &DF, MCAsmLayout &Layout,
+ bool &WasRelaxed) const override;
+ bool relaxDwarfCFA(MCDwarfCallFrameFragment &DF, MCAsmLayout &Layout,
+ bool &WasRelaxed) const override;
+
bool writeNopData(raw_ostream &OS, uint64_t Count,
const MCSubtargetInfo *STI) const override;
diff --git a/llvm/test/DebugInfo/LoongArch/dwarf-loongarch-relocs.ll b/llvm/test/DebugInfo/LoongArch/dwarf-loongarch-relocs.ll
new file mode 100644
index 000000000000..e03b4c1d34de
--- /dev/null
+++ b/llvm/test/DebugInfo/LoongArch/dwarf-loongarch-relocs.ll
@@ -0,0 +1,128 @@
+; RUN: llc --filetype=obj --mtriple=loongarch64 --mattr=-relax %s -o %t.o
+; RUN: llvm-readobj -r %t.o | FileCheck --check-prefixes=RELOCS-BOTH,RELOCS-NORL %s
+; RUN: llvm-objdump --source %t.o | FileCheck --check-prefix=SOURCE %s
+; RUN: llvm-dwarfdump --debug-info --debug-line %t.o | FileCheck --check-prefix=DWARF %s
+
+; RUN: llc --filetype=obj --mtriple=loongarch64 --mattr=+relax %s -o %t.r.o
+; RUN: llvm-readobj -r %t.r.o | FileCheck --check-prefixes=RELOCS-BOTH,RELOCS-ENRL %s
+; RUN: llvm-objdump --source %t.r.o | FileCheck --check-prefix=SOURCE %s
+; RUN: llvm-dwarfdump --debug-info --debug-line %t.r.o | FileCheck --check-prefix=DWARF %s
+
+; RELOCS-BOTH: Relocations [
+; RELOCS-BOTH-NEXT: Section ({{.*}}) .rela.text {
+; RELOCS-BOTH-NEXT: 0x14 R_LARCH_PCALA_HI20 sym 0x0
+; RELOCS-ENRL-NEXT: 0x14 R_LARCH_RELAX - 0x0
+; RELOCS-BOTH-NEXT: 0x18 R_LARCH_PCALA_LO12 sym 0x0
+; RELOCS-ENRL-NEXT: 0x18 R_LARCH_RELAX - 0x0
+; RELOCS-BOTH-NEXT: }
+; RELOCS-BOTH: Section ({{.*}}) .rela.debug_frame {
+; RELOCS-NORL-NEXT: 0x1C R_LARCH_32 .debug_frame 0x0
+; RELOCS-NORL-NEXT: 0x20 R_LARCH_64 .text 0x0
+; RELOCS-ENRL-NEXT: 0x1C R_LARCH_32 <null> 0x0
+; RELOCS-ENRL-NEXT: 0x20 R_LARCH_64 <null> 0x0
+; RELOCS-ENRL-NEXT: 0x28 R_LARCH_ADD64 <null> 0x0
+; RELOCS-ENRL-NEXT: 0x28 R_LARCH_SUB64 <null> 0x0
+; RELOCS-ENRL-NEXT: 0x3F R_LARCH_ADD6 <null> 0x0
+; RELOCS-ENRL-NEXT: 0x3F R_LARCH_SUB6 <null> 0x0
+; RELOCS-BOTH-NEXT: }
+; RELOCS-BOTH: Section ({{.*}}) .rela.debug_line {
+; RELOCS-BOTH-NEXT: 0x22 R_LARCH_32 .debug_line_str 0x0
+; RELOCS-BOTH-NEXT: 0x31 R_LARCH_32 .debug_line_str 0x2
+; RELOCS-BOTH-NEXT: 0x46 R_LARCH_32 .debug_line_str 0x1B
+; RELOCS-NORL-NEXT: 0x4F R_LARCH_64 .text 0x0
+; RELOCS-ENRL-NEXT: 0x4F R_LARCH_64 <null> 0x0
+; RELOCS-ENRL-NEXT: 0x5F R_LARCH_ADD16 <null> 0x0
+; RELOCS-ENRL-NEXT: 0x5F R_LARCH_SUB16 <null> 0x0
+; RELOCS-BOTH-NEXT: }
+; RELOCS-BOTH-NEXT: ]
+
+; SOURCE: 0000000000000000 <foo>:
+; SOURCE: ; {
+; SOURCE: ; asm volatile(
+; SOURCE: ; return 0;
+
+; DWARF: DW_AT_producer ("clang")
+; DWARF: DW_AT_name ("dwarf-loongarch-relocs.c")
+; DWARF: DW_AT_comp_dir (".")
+; DWARF: DW_AT_name ("foo")
+; DWARF-NEXT: DW_AT_decl_file ("{{.*}}dwarf-loongarch-relocs.c")
+; DWARF-NEXT: DW_AT_decl_line (1)
+; DWARF-NEXT: DW_AT_type (0x00000032 "int")
+; DWARF: DW_AT_name ("int")
+; DWARF-NEXT: DW_AT_encoding (DW_ATE_signed)
+; DWARF-NEXT: DW_AT_byte_size (0x04)
+; DWARF: .debug_line contents:
+; DWARF-NEXT: debug_line[0x00000000]
+; DWARF-NEXT: Line table prologue:
+; DWARF-NEXT: total_length: {{.*}}
+; DWARF-NEXT: format: DWARF32
+; DWARF-NEXT: version: 5
+; DWARF-NEXT: address_size: 8
+; DWARF-NEXT: seg_select_size: 0
+; DWARF-NEXT: prologue_length: 0x0000003e
+; DWARF-NEXT: min_inst_length: 1
+; DWARF-NEXT: max_ops_per_inst: 1
+; DWARF-NEXT: default_is_stmt: 1
+; DWARF-NEXT: line_base: -5
+; DWARF-NEXT: line_range: 14
+; DWARF-NEXT: opcode_base: 13
+; DWARF-NEXT: standard_opcode_lengths[DW_LNS_copy] = 0
+; DWARF-NEXT: standard_opcode_lengths[DW_LNS_advance_pc] = 1
+; DWARF-NEXT: standard_opcode_lengths[DW_LNS_advance_line] = 1
+; DWARF-NEXT: standard_opcode_lengths[DW_LNS_set_file] = 1
+; DWARF-NEXT: standard_opcode_lengths[DW_LNS_set_column] = 1
+; DWARF-NEXT: standard_opcode_lengths[DW_LNS_negate_stmt] = 0
+; DWARF-NEXT: standard_opcode_lengths[DW_LNS_set_basic_block] = 0
+; DWARF-NEXT: standard_opcode_lengths[DW_LNS_const_add_pc] = 0
+; DWARF-NEXT: standard_opcode_lengths[DW_LNS_fixed_advance_pc] = 1
+; DWARF-NEXT: standard_opcode_lengths[DW_LNS_set_prologue_end] = 0
+; DWARF-NEXT: standard_opcode_lengths[DW_LNS_set_epilogue_begin] = 0
+; DWARF-NEXT: standard_opcode_lengths[DW_LNS_set_isa] = 1
+; DWARF-NEXT: include_directories[ 0] = "."
+; DWARF-NEXT: file_names[ 0]:
+; DWARF-NEXT: name: "dwarf-loongarch-relocs.c"
+; DWARF-NEXT: dir_index: 0
+; DWARF-NEXT: md5_checksum: f44d6d71bc4da58b4abe338ca507c007
+; DWARF-NEXT: source: "{{.*}}"
+; DWARF-EMPTY:
+; DWARF-NEXT: Address Line Column File ISA Discriminator OpIndex Flags
+; DWARF-NEXT: ------------------ ------ ------ ------ --- ------------- ------- -------------
+; DWARF-NEXT: 0x0000000000000000 2 0 0 0 0 0 is_stmt
+; DWARF-NEXT: 0x0000000000000010 3 3 0 0 0 0 is_stmt prologue_end
+; DWARF-NEXT: 0x0000000000000020 10 3 0 0 0 0 is_stmt
+; DWARF-NEXT: 0x000000000000002c 10 3 0 0 0 0 epilogue_begin
+; DWARF-NEXT: 0x0000000000000034 10 3 0 0 0 0 end_sequence
+
+; ModuleID = 'dwarf-loongarch-relocs.c'
+source_filename = "dwarf-loongarch-relocs.c"
+target datalayout = "e-m:e-p:64:64-i64:64-i128:128-n64-S128"
+target triple = "loongarch64"
+
+; Function Attrs: noinline nounwind optnone
+define dso_local signext i32 @foo() #0 !dbg !8 {
+ call void asm sideeffect ".cfi_remember_state\0A\09.cfi_adjust_cfa_offset 16\0A\09nop\0A\09la.pcrel $$t0, sym\0A\09nop\0A\09.cfi_restore_state\0A\09", ""() #1, !dbg !12, !srcloc !13
+ ret i32 0, !dbg !14
+}
+
+attributes #0 = { noinline nounwind optnone "frame-pointer"="all" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="loongarch64" "target-features"="+64bit,+d,+f,+ual" }
+attributes #1 = { nounwind }
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2, !3, !4, !5, !6}
+!llvm.ident = !{!7}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "clang", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
+!1 = !DIFile(filename: "dwarf-loongarch-relocs.c", directory: ".", checksumkind: CSK_MD5, checksum: "f44d6d71bc4da58b4abe338ca507c007", source: "int foo()\0A{\0A asm volatile(\0A \22.cfi_remember_state\\n\\t\22\0A \22.cfi_adjust_cfa_offset 16\\n\\t\22\0A \22nop\\n\\t\22\0A \22la.pcrel $t0, sym\\n\\t\22\0A \22nop\\n\\t\22\0A \22.cfi_restore_state\\n\\t\22);\0A return 0;\0A}\0A")
+!2 = !{i32 7, !"Dwarf Version", i32 5}
+!3 = !{i32 2, !"Debug Info Version", i32 3}
+!4 = !{i32 1, !"wchar_size", i32 4}
+!5 = !{i32 7, !"direct-access-external-data", i32 0}
+!6 = !{i32 7, !"frame-pointer", i32 2}
+!7 = !{!"clang"}
+!8 = distinct !DISubprogram(name: "foo", scope: !1, file: !1, line: 1, type: !9, scopeLine: 2, spFlags: DISPFlagDefinition, unit: !0)
+!9 = !DISubroutineType(types: !10)
+!10 = !{!11}
+!11 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+!12 = !DILocation(line: 3, column: 3, scope: !8)
+!13 = !{i64 34, i64 56, i64 92, i64 106, i64 134, i64 148, i64 177}
+!14 = !DILocation(line: 10, column: 3, scope: !8)
diff --git a/llvm/test/DebugInfo/LoongArch/lit.local.cfg b/llvm/test/DebugInfo/LoongArch/lit.local.cfg
new file mode 100644
index 000000000000..77becb8eee90
--- /dev/null
+++ b/llvm/test/DebugInfo/LoongArch/lit.local.cfg
@@ -0,0 +1,2 @@
+if "LoongArch" not in config.root.targets:
+ config.unsupported = True
--
2.20.1
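
The R_LARCH_ADD6/R_LARCH_SUB6 cases added to resolveLoongArch above only touch the low 6 bits of the byte at the relocated location and keep the top two bits. The following sketch restates the two expressions from the patch as free functions so the effect can be checked in isolation; `resolveAdd6` and `resolveSub6` are names chosen for this note, and the example values are assumptions rather than values taken from the test.

```cpp
#include <cassert>
#include <cstdint>

// Same arithmetic as the R_LARCH_ADD6 case: preserve bits 7:6, add into bits 5:0.
uint64_t resolveAdd6(uint64_t LocData, uint64_t S, int64_t Addend) {
  return (LocData & 0xC0) | ((LocData + S + Addend) & 0x3F);
}

// Same arithmetic as the R_LARCH_SUB6 case: preserve bits 7:6, subtract in bits 5:0.
uint64_t resolveSub6(uint64_t LocData, uint64_t S, int64_t Addend) {
  return (LocData & 0xC0) | ((LocData - (S + Addend)) & 0x3F);
}
```

For instance, starting from a DW_CFA_advance_loc byte of 0x40 with a zero delta, applying ADD6 with a symbol value of 0x20 and then SUB6 with a symbol value of 0x10 gives 0x50: the opcode bits stay 0x40 and the delta becomes 0x10.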


@ -0,0 +1,362 @@
From 87f6adc2acf635a0a4c294217fb54c55eee3a06c Mon Sep 17 00:00:00 2001
From: Jinyang He <hejinyang@loongson.cn>
Date: Wed, 24 Jan 2024 09:17:49 +0800
Subject: [PATCH 08/14] [LoongArch] Insert nops and emit align reloc when
handle alignment directive (#72962)
Following RISCV, fix up alignment when linker relaxation changes code size
and breaks it. Insert enough nops and emit an R_LARCH_ALIGN relocation so
that the linker can satisfy the alignment by removing nops.
This is done only in sections with the SHF_EXECINSTR flag.
In LoongArch psABI v2.30, R_LARCH_ALIGN requires a symbol index. The
lowest 8 bits of the addend represent the alignment and the other bits of
the addend represent the maximum number of bytes to emit.
(cherry picked from commit c51ab483e6c2d991a01179584705b83fbea1940d)
Change-Id: Iba30702c9dda378acfae0b1f1134926fa838a368
---
llvm/lib/MC/MCExpr.cpp | 2 +-
.../MCTargetDesc/LoongArchAsmBackend.cpp | 67 ++++++++++++++++
.../MCTargetDesc/LoongArchAsmBackend.h | 15 ++++
.../MCTargetDesc/LoongArchFixupKinds.h | 4 +-
.../Relocations/align-non-executable.s | 27 +++++++
.../MC/LoongArch/Relocations/relax-addsub.s | 15 +++-
.../MC/LoongArch/Relocations/relax-align.s | 79 +++++++++++++++++++
7 files changed, 205 insertions(+), 4 deletions(-)
create mode 100644 llvm/test/MC/LoongArch/Relocations/align-non-executable.s
create mode 100644 llvm/test/MC/LoongArch/Relocations/relax-align.s
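
The addend layout described in the commit message can be reproduced outside LLVM. The sketch below follows the arithmetic in shouldInsertExtraNopBytesForCodeAlign and shouldInsertFixupForCodeAlign from this patch; `floorLog2` is a local stand-in for `llvm::Log2_64`, `alignAddend` is a hypothetical helper name, and the alignment is assumed to be a power of two greater than the 4-byte nop size.

```cpp
#include <cassert>
#include <cstdint>

// Floor of log2 for a non-zero value (local stand-in for llvm::Log2_64).
unsigned floorLog2(uint64_t V) {
  assert(V != 0 && "log2 of zero is undefined");
  unsigned R = 0;
  while (V >>= 1)
    ++R;
  return R;
}

// Computes the R_LARCH_ALIGN addend the way the patch does:
//   Count = Alignment - 4            (nop bytes to insert)
//   Lo    = Log2_64(Count) + 1       (log2 of the alignment)
//   Hi    = MaxBytesToEmit, or 0 when the limit does not restrict Count
uint64_t alignAddend(uint64_t Alignment, uint64_t MaxBytesToEmit) {
  const uint64_t MinNopLen = 4;
  assert(Alignment > MinNopLen && "no extra nops needed at or below 4 bytes");
  uint64_t Count = Alignment - MinNopLen;
  uint64_t Lo = floorLog2(Count) + 1;
  uint64_t Hi = MaxBytesToEmit >= Count ? 0 : MaxBytesToEmit;
  return (Hi << 8) | Lo;
}
```

Under these assumptions a 16-byte alignment with no effective limit yields an addend of 0x4, while the same alignment limited to 8 bytes yields 0x804.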
diff --git a/llvm/lib/MC/MCExpr.cpp b/llvm/lib/MC/MCExpr.cpp
index a561fed11179..79808a58d81c 100644
--- a/llvm/lib/MC/MCExpr.cpp
+++ b/llvm/lib/MC/MCExpr.cpp
@@ -711,7 +711,7 @@ static void AttemptToFoldSymbolOffsetDifference(
if (DF) {
Displacement += DF->getContents().size();
} else if (auto *AF = dyn_cast<MCAlignFragment>(FI);
- AF && Layout &&
+ AF && Layout && AF->hasEmitNops() &&
!Asm->getBackend().shouldInsertExtraNopBytesForCodeAlign(
*AF, Count)) {
Displacement += Asm->computeFragmentSize(*Layout, *AF);
diff --git a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.cpp b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.cpp
index 8d82327b2e2b..8c482356402f 100644
--- a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.cpp
+++ b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.cpp
@@ -17,10 +17,13 @@
#include "llvm/MC/MCAssembler.h"
#include "llvm/MC/MCContext.h"
#include "llvm/MC/MCELFObjectWriter.h"
+#include "llvm/MC/MCExpr.h"
+#include "llvm/MC/MCSection.h"
#include "llvm/MC/MCValue.h"
#include "llvm/Support/Endian.h"
#include "llvm/Support/EndianStream.h"
#include "llvm/Support/LEB128.h"
+#include "llvm/Support/MathExtras.h"
#define DEBUG_TYPE "loongarch-asmbackend"
@@ -177,6 +180,70 @@ void LoongArchAsmBackend::applyFixup(const MCAssembler &Asm,
}
}
+// Linker relaxation may change code size, so when it is enabled we have to
+// insert nops for the .align directive. The linker can then satisfy the
+// alignment by removing nops.
+// On success, Size receives the total nop size we need to insert.
+bool LoongArchAsmBackend::shouldInsertExtraNopBytesForCodeAlign(
+ const MCAlignFragment &AF, unsigned &Size) {
+ // Calculate the nop size only when linker relaxation is enabled.
+ if (!AF.getSubtargetInfo()->hasFeature(LoongArch::FeatureRelax))
+ return false;
+
+ // Ignore alignment if MaxBytesToEmit is less than the minimum Nop size.
+ const unsigned MinNopLen = 4;
+ if (AF.getMaxBytesToEmit() < MinNopLen)
+ return false;
+ Size = AF.getAlignment().value() - MinNopLen;
+ return AF.getAlignment() > MinNopLen;
+}
+
+// When linker relaxation is enabled, we need to emit an R_LARCH_ALIGN
+// relocation to mark the position of the nops and the total number of nop
+// bytes that have been inserted.
+// This function inserts a fixup_loongarch_align fixup, which is eventually
+// converted to an R_LARCH_ALIGN relocation.
+// The improved R_LARCH_ALIGN requires a symbol index. The lowest 8 bits of the
+// addend represent the alignment and the other bits represent the maximum
+// number of bytes to emit. A maximum of zero means the emit limit is
+// ignored.
+bool LoongArchAsmBackend::shouldInsertFixupForCodeAlign(
+ MCAssembler &Asm, const MCAsmLayout &Layout, MCAlignFragment &AF) {
+ // Insert the fixup only when linker relaxation is enabled.
+ if (!AF.getSubtargetInfo()->hasFeature(LoongArch::FeatureRelax))
+ return false;
+
+ // Calculate total Nops we need to insert. If there are none to insert
+ // then simply return.
+ unsigned Count;
+ if (!shouldInsertExtraNopBytesForCodeAlign(AF, Count))
+ return false;
+
+ MCSection *Sec = AF.getParent();
+ MCContext &Ctx = Asm.getContext();
+ const MCExpr *Dummy = MCConstantExpr::create(0, Ctx);
+ // Create fixup_loongarch_align fixup.
+ MCFixup Fixup =
+ MCFixup::create(0, Dummy, MCFixupKind(LoongArch::fixup_loongarch_align));
+ const MCSymbolRefExpr *MCSym = getSecToAlignSym()[Sec];
+ if (MCSym == nullptr) {
+ // Create a symbol and make the value of symbol is zero.
+ MCSymbol *Sym = Ctx.createNamedTempSymbol("la-relax-align");
+ Sym->setFragment(&*Sec->getBeginSymbol()->getFragment());
+ Asm.registerSymbol(*Sym);
+ MCSym = MCSymbolRefExpr::create(Sym, Ctx);
+ getSecToAlignSym()[Sec] = MCSym;
+ }
+
+ uint64_t FixedValue = 0;
+ unsigned Lo = Log2_64(Count) + 1;
+ unsigned Hi = AF.getMaxBytesToEmit() >= Count ? 0 : AF.getMaxBytesToEmit();
+ MCValue Value = MCValue::get(MCSym, nullptr, Hi << 8 | Lo);
+ Asm.getWriter().recordRelocation(Asm, Layout, &AF, Fixup, Value, FixedValue);
+
+ return true;
+}
+
bool LoongArchAsmBackend::shouldForceRelocation(const MCAssembler &Asm,
const MCFixup &Fixup,
const MCValue &Target) {
diff --git a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.h b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.h
index 657f5ca5e731..71bbd003888a 100644
--- a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.h
+++ b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchAsmBackend.h
@@ -17,7 +17,9 @@
#include "MCTargetDesc/LoongArchFixupKinds.h"
#include "MCTargetDesc/LoongArchMCTargetDesc.h"
#include "llvm/MC/MCAsmBackend.h"
+#include "llvm/MC/MCExpr.h"
#include "llvm/MC/MCFixupKindInfo.h"
+#include "llvm/MC/MCSection.h"
#include "llvm/MC/MCSubtargetInfo.h"
namespace llvm {
@@ -27,6 +29,7 @@ class LoongArchAsmBackend : public MCAsmBackend {
uint8_t OSABI;
bool Is64Bit;
const MCTargetOptions &TargetOptions;
+ DenseMap<MCSection *, const MCSymbolRefExpr *> SecToAlignSym;
public:
LoongArchAsmBackend(const MCSubtargetInfo &STI, uint8_t OSABI, bool Is64Bit,
@@ -45,6 +48,15 @@ public:
uint64_t Value, bool IsResolved,
const MCSubtargetInfo *STI) const override;
+ // Return Size with extra Nop Bytes for alignment directive in code section.
+ bool shouldInsertExtraNopBytesForCodeAlign(const MCAlignFragment &AF,
+ unsigned &Size) override;
+
+ // Insert target specific fixup type for alignment directive in code section.
+ bool shouldInsertFixupForCodeAlign(MCAssembler &Asm,
+ const MCAsmLayout &Layout,
+ MCAlignFragment &AF) override;
+
bool shouldForceRelocation(const MCAssembler &Asm, const MCFixup &Fixup,
const MCValue &Target) override;
@@ -79,6 +91,9 @@ public:
std::unique_ptr<MCObjectTargetWriter>
createObjectTargetWriter() const override;
const MCTargetOptions &getTargetOptions() const { return TargetOptions; }
+ DenseMap<MCSection *, const MCSymbolRefExpr *> &getSecToAlignSym() {
+ return SecToAlignSym;
+ }
};
} // end namespace llvm
diff --git a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchFixupKinds.h b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchFixupKinds.h
index 178fa6e5262b..78414408f21f 100644
--- a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchFixupKinds.h
+++ b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchFixupKinds.h
@@ -108,7 +108,9 @@ enum Fixups {
// 20-bit fixup corresponding to %gd_hi20(foo) for instruction lu12i.w.
fixup_loongarch_tls_gd_hi20,
// Generate an R_LARCH_RELAX which indicates the linker may relax here.
- fixup_loongarch_relax = FirstLiteralRelocationKind + ELF::R_LARCH_RELAX
+ fixup_loongarch_relax = FirstLiteralRelocationKind + ELF::R_LARCH_RELAX,
+ // Generate an R_LARCH_ALIGN which indicates the linker may fix up alignment here.
+ fixup_loongarch_align = FirstLiteralRelocationKind + ELF::R_LARCH_ALIGN,
};
} // end namespace LoongArch
} // end namespace llvm
diff --git a/llvm/test/MC/LoongArch/Relocations/align-non-executable.s b/llvm/test/MC/LoongArch/Relocations/align-non-executable.s
new file mode 100644
index 000000000000..47834acd9521
--- /dev/null
+++ b/llvm/test/MC/LoongArch/Relocations/align-non-executable.s
@@ -0,0 +1,27 @@
+## A label difference separated by an alignment directive, when the
+## referenced symbols are in a non-executable section with instructions,
+## should generate ADD/SUB relocations.
+## https://github.com/llvm/llvm-project/pull/76552
+
+# RUN: llvm-mc --filetype=obj --triple=loongarch64 --mattr=+relax %s \
+# RUN: | llvm-readobj -r - | FileCheck --check-prefixes=CHECK,RELAX %s
+# RUN: llvm-mc --filetype=obj --triple=loongarch64 --mattr=-relax %s \
+# RUN: | llvm-readobj -r - | FileCheck %s
+
+.section ".dummy", "a"
+.L1:
+ la.pcrel $t0, sym
+.p2align 3
+.L2:
+.dword .L2 - .L1
+
+# CHECK: Relocations [
+# CHECK-NEXT: Section ({{.*}}) .rela.dummy {
+# CHECK-NEXT: 0x0 R_LARCH_PCALA_HI20 sym 0x0
+# RELAX-NEXT: 0x0 R_LARCH_RELAX - 0x0
+# CHECK-NEXT: 0x4 R_LARCH_PCALA_LO12 sym 0x0
+# RELAX-NEXT: 0x4 R_LARCH_RELAX - 0x0
+# RELAX-NEXT: 0x8 R_LARCH_ADD64 .L2 0x0
+# RELAX-NEXT: 0x8 R_LARCH_SUB64 .L1 0x0
+# CHECK-NEXT: }
+# CHECK-NEXT: ]
diff --git a/llvm/test/MC/LoongArch/Relocations/relax-addsub.s b/llvm/test/MC/LoongArch/Relocations/relax-addsub.s
index cd01332afd0b..18e0ede5e293 100644
--- a/llvm/test/MC/LoongArch/Relocations/relax-addsub.s
+++ b/llvm/test/MC/LoongArch/Relocations/relax-addsub.s
@@ -28,12 +28,23 @@
# RELAX: Relocations [
# RELAX-NEXT: Section ({{.*}}) .rela.text {
+# RELAX-NEXT: 0x4 R_LARCH_ALIGN {{.*}} 0x4
# RELAX-NEXT: 0x10 R_LARCH_PCALA_HI20 .L1 0x0
# RELAX-NEXT: 0x10 R_LARCH_RELAX - 0x0
# RELAX-NEXT: 0x14 R_LARCH_PCALA_LO12 .L1 0x0
# RELAX-NEXT: 0x14 R_LARCH_RELAX - 0x0
# RELAX-NEXT: }
# RELAX-NEXT: Section ({{.*}}) .rela.data {
+# RELAX-NEXT: 0x10 R_LARCH_ADD8 .L3 0x0
+# RELAX-NEXT: 0x10 R_LARCH_SUB8 .L2 0x0
+# RELAX-NEXT: 0x11 R_LARCH_ADD16 .L3 0x0
+# RELAX-NEXT: 0x11 R_LARCH_SUB16 .L2 0x0
+# RELAX-NEXT: 0x13 R_LARCH_ADD32 .L3 0x0
+# RELAX-NEXT: 0x13 R_LARCH_SUB32 .L2 0x0
+# RELAX-NEXT: 0x17 R_LARCH_ADD64 .L3 0x0
+# RELAX-NEXT: 0x17 R_LARCH_SUB64 .L2 0x0
+# RELAX-NEXT: 0x1F R_LARCH_ADD_ULEB128 .L3 0x0
+# RELAX-NEXT: 0x1F R_LARCH_SUB_ULEB128 .L2 0x0
# RELAX-NEXT: 0x20 R_LARCH_ADD8 .L4 0x0
# RELAX-NEXT: 0x20 R_LARCH_SUB8 .L3 0x0
# RELAX-NEXT: 0x21 R_LARCH_ADD16 .L4 0x0
@@ -57,7 +68,7 @@
# RELAX: Hex dump of section '.data':
# RELAX-NEXT: 0x00000000 04040004 00000004 00000000 00000004
-# RELAX-NEXT: 0x00000010 0c0c000c 0000000c 00000000 0000000c
+# RELAX-NEXT: 0x00000010 00000000 00000000 00000000 00000000
# RELAX-NEXT: 0x00000020 00000000 00000000 00000000 00000000
# RELAX-NEXT: 0x00000030 00000000 00000000 00000000 000000
@@ -78,7 +89,7 @@
.word .L2 - .L1
.dword .L2 - .L1
.uleb128 .L2 - .L1
-## TODO Handle alignment directive.
+## With relaxation, emit relocs because the .align makes the diff variable.
.byte .L3 - .L2
.short .L3 - .L2
.word .L3 - .L2
diff --git a/llvm/test/MC/LoongArch/Relocations/relax-align.s b/llvm/test/MC/LoongArch/Relocations/relax-align.s
new file mode 100644
index 000000000000..294fd9fb916c
--- /dev/null
+++ b/llvm/test/MC/LoongArch/Relocations/relax-align.s
@@ -0,0 +1,79 @@
+## This file tests Nop insertion with R_LARCH_ALIGN for relaxation.
+
+# RUN: llvm-mc --filetype=obj --triple=loongarch64 --mattr=-relax %s -o %t
+# RUN: llvm-objdump -d %t | FileCheck %s --check-prefix=INSTR
+# RUN: llvm-readobj -r %t | FileCheck %s --check-prefix=RELOC
+# RUN: llvm-mc --filetype=obj --triple=loongarch64 --mattr=+relax %s -o %t.r
+# RUN: llvm-objdump -d %t.r | FileCheck %s --check-prefixes=INSTR,RELAX-INSTR
+# RUN: llvm-readobj -r %t.r | FileCheck %s --check-prefixes=RELOC,RELAX-RELOC
+
+.text
+break 0
+# INSTR: break 0
+
+## Do not emit R_LARCH_ALIGN if the alignment directive is less than or equal
+## to the minimum code alignment (i.e. 4).
+.p2align 2
+.p2align 1
+.p2align 0
+
+## Do not emit instructions if the max emit bytes is less than the min nop size.
+.p2align 4, , 2
+
+## Do not emit R_LARCH_ALIGN if the alignment directive has an explicit padding value.
+## This matches the behavior of the GNU assembler.
+break 1
+.p2align 4, 1
+# INSTR-NEXT: break 1
+# INSTR-COUNT-2: 01 01 01 01
+
+break 2
+.p2align 4, 1, 12
+# INSTR-NEXT: break 2
+# INSTR-COUNT-3: 01 01 01 01
+
+break 3
+.p2align 4
+# INSTR-NEXT: break 3
+# INSTR-COUNT-3: nop
+
+break 4
+.p2align 5
+.p2align 4
+# INSTR-NEXT: break 4
+# INSTR-COUNT-3: nop
+# RELAX-INSTR-COUNT-7: nop
+
+break 5
+.p2align 4, , 11
+# INSTR-NEXT: break 5
+# RELAX-INSTR-COUNT-3: nop
+
+break 6
+## Not emit the third parameter.
+.p2align 4, , 12
+# INSTR-NEXT: break 6
+# INSTR-NEXT: nop
+# INSTR-NEXT: nop
+# RELAX-INSTR-NEXT: nop
+
+ret
+# INSNR-NEXT: ret
+
+## Test the symbol index is different from .text.
+.section .text2, "ax"
+.p2align 4
+break 7
+
+# RELOC: Relocations [
+# RELAX-RELOC-NEXT: Section ({{.*}}) .rela.text {
+# RELAX-RELOC-NEXT: 0x24 R_LARCH_ALIGN .Lla-relax-align0 0x4
+# RELAX-RELOC-NEXT: 0x34 R_LARCH_ALIGN .Lla-relax-align0 0x5
+# RELAX-RELOC-NEXT: 0x50 R_LARCH_ALIGN .Lla-relax-align0 0x4
+# RELAX-RELOC-NEXT: 0x60 R_LARCH_ALIGN .Lla-relax-align0 0xB04
+# RELAX-RELOC-NEXT: 0x70 R_LARCH_ALIGN .Lla-relax-align0 0x4
+# RELAX-RELOC-NEXT: }
+# RELAX-RELOC-NEXT: Section ({{.*}}) .rela.text2 {
+# RELAX-RELOC-NEXT: 0x0 R_LARCH_ALIGN .Lla-relax-align1 0x4
+# RELAX-RELOC-NEXT: }
+# RELOC-NEXT: ]
--
2.20.1
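
The addend values in the `relax-align.s` expectations above (0x4, 0x5, 0xB04) can be reproduced by hand. The following is a minimal sketch of that arithmetic, not the LLVM implementation: `encodeAlignAddend` and `log2Floor` are hypothetical names, and the sketch assumes that a directive without a third operand reaches the backend with its maximum byte count equal to the alignment itself.

```cpp
#include <cstdint>

// Floor of log2 for a non-zero value; stands in for llvm::Log2_64.
static unsigned log2Floor(uint64_t V) {
  unsigned R = 0;
  while (V >>= 1)
    ++R;
  return R;
}

// Hypothetical helper mirroring the arithmetic in
// shouldInsertFixupForCodeAlign. Alignment is in bytes (16 for .p2align 4)
// and must be greater than 4; MaxBytesToEmit is the directive's limit.
uint64_t encodeAlignAddend(uint64_t Alignment, uint64_t MaxBytesToEmit) {
  const uint64_t MinNopLen = 4;
  uint64_t Count = Alignment - MinNopLen;  // worst-case Nop bytes
  uint64_t Lo = log2Floor(Count) + 1;      // equals log2(Alignment)
  // A limit that can never be hit is encoded as zero ("no limit").
  uint64_t Hi = MaxBytesToEmit >= Count ? 0 : MaxBytesToEmit;
  return Hi << 8 | Lo;
}
```

Under these assumptions, `.p2align 4, , 11` gives `encodeAlignAddend(16, 11)`, which is 0xB04, while `.p2align 4, , 12` collapses to 0x4 because a limit of 12 bytes already covers the worst case.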

@ -0,0 +1,86 @@
From f51ee6c3468eacc82d3b3f09fcca381178bdc9e7 Mon Sep 17 00:00:00 2001
From: Weining Lu <luweining@loongson.cn>
Date: Wed, 24 Jan 2024 11:03:14 +0800
Subject: [PATCH 11/14] [test] Update dwarf-loongarch-relocs.ll
Address buildbot failures:
http://45.33.8.238/macm1/77360/step_11.txt
http://45.33.8.238/linux/128902/step_12.txt
(cherry picked from commit baba7e4175b6ca21e83b1cf8229f29dbba02e979)
(cherry picked from commit c9e73cdd9a17f15ede120ea57657553f9e105eab)
Change-Id: I00aa1414f556f0ba5ff6bf6a879a6fc1fcfa49e0
---
.../LoongArch/dwarf-loongarch-relocs.ll | 37 ++++++++++++-------
1 file changed, 23 insertions(+), 14 deletions(-)
diff --git a/llvm/test/DebugInfo/LoongArch/dwarf-loongarch-relocs.ll b/llvm/test/DebugInfo/LoongArch/dwarf-loongarch-relocs.ll
index e03b4c1d34de..07443a62b933 100644
--- a/llvm/test/DebugInfo/LoongArch/dwarf-loongarch-relocs.ll
+++ b/llvm/test/DebugInfo/LoongArch/dwarf-loongarch-relocs.ll
@@ -1,19 +1,22 @@
; RUN: llc --filetype=obj --mtriple=loongarch64 --mattr=-relax %s -o %t.o
; RUN: llvm-readobj -r %t.o | FileCheck --check-prefixes=RELOCS-BOTH,RELOCS-NORL %s
-; RUN: llvm-objdump --source %t.o | FileCheck --check-prefix=SOURCE %s
-; RUN: llvm-dwarfdump --debug-info --debug-line %t.o | FileCheck --check-prefix=DWARF %s
+; RUN: llvm-objdump --source %t.o | FileCheck --check-prefixes=SOURCE,SOURCE-NORL %s
+; RUN: llvm-dwarfdump --debug-info --debug-line %t.o | FileCheck --check-prefixes=DWARF,DWARF-NORL %s
; RUN: llc --filetype=obj --mtriple=loongarch64 --mattr=+relax %s -o %t.r.o
; RUN: llvm-readobj -r %t.r.o | FileCheck --check-prefixes=RELOCS-BOTH,RELOCS-ENRL %s
-; RUN: llvm-objdump --source %t.r.o | FileCheck --check-prefix=SOURCE %s
-; RUN: llvm-dwarfdump --debug-info --debug-line %t.r.o | FileCheck --check-prefix=DWARF %s
+; RUN: llvm-objdump --source %t.r.o | FileCheck --check-prefixes=SOURCE,SOURCE-ENRL %s
+; RUN: llvm-dwarfdump --debug-info --debug-line %t.r.o | FileCheck --check-prefixes=DWARF,DWARF-ENRL %s
; RELOCS-BOTH: Relocations [
; RELOCS-BOTH-NEXT: Section ({{.*}}) .rela.text {
-; RELOCS-BOTH-NEXT: 0x14 R_LARCH_PCALA_HI20 sym 0x0
-; RELOCS-ENRL-NEXT: 0x14 R_LARCH_RELAX - 0x0
-; RELOCS-BOTH-NEXT: 0x18 R_LARCH_PCALA_LO12 sym 0x0
-; RELOCS-ENRL-NEXT: 0x18 R_LARCH_RELAX - 0x0
+; RELOCS-NORL-NEXT: 0x14 R_LARCH_PCALA_HI20 sym 0x0
+; RELOCS-NORL-NEXT: 0x18 R_LARCH_PCALA_LO12 sym 0x0
+; RELOCS-ENRL-NEXT: 0x0 R_LARCH_ALIGN .Lla-relax-align0 0x5
+; RELOCS-ENRL-NEXT: 0x30 R_LARCH_PCALA_HI20 sym 0x0
+; RELOCS-ENRL-NEXT: 0x30 R_LARCH_RELAX - 0x0
+; RELOCS-ENRL-NEXT: 0x34 R_LARCH_PCALA_LO12 sym 0x0
+; RELOCS-ENRL-NEXT: 0x34 R_LARCH_RELAX - 0x0
; RELOCS-BOTH-NEXT: }
; RELOCS-BOTH: Section ({{.*}}) .rela.debug_frame {
; RELOCS-NORL-NEXT: 0x1C R_LARCH_32 .debug_frame 0x0
@@ -36,7 +39,8 @@
; RELOCS-BOTH-NEXT: }
; RELOCS-BOTH-NEXT: ]
-; SOURCE: 0000000000000000 <foo>:
+; SOURCE-NORL: 0000000000000000 <foo>:
+; SOURCE-ENRL: 000000000000001c <foo>:
; SOURCE: ; {
; SOURCE: ; asm volatile(
; SOURCE: ; return 0;
@@ -87,11 +91,16 @@
; DWARF-EMPTY:
; DWARF-NEXT: Address Line Column File ISA Discriminator OpIndex Flags
; DWARF-NEXT: ------------------ ------ ------ ------ --- ------------- ------- -------------
-; DWARF-NEXT: 0x0000000000000000 2 0 0 0 0 0 is_stmt
-; DWARF-NEXT: 0x0000000000000010 3 3 0 0 0 0 is_stmt prologue_end
-; DWARF-NEXT: 0x0000000000000020 10 3 0 0 0 0 is_stmt
-; DWARF-NEXT: 0x000000000000002c 10 3 0 0 0 0 epilogue_begin
-; DWARF-NEXT: 0x0000000000000034 10 3 0 0 0 0 end_sequence
+; DWARF-NORL-NEXT: 0x0000000000000000 2 0 0 0 0 0 is_stmt
+; DWARF-NORL-NEXT: 0x0000000000000010 3 3 0 0 0 0 is_stmt prologue_end
+; DWARF-NORL-NEXT: 0x0000000000000020 10 3 0 0 0 0 is_stmt
+; DWARF-NORL-NEXT: 0x000000000000002c 10 3 0 0 0 0 epilogue_begin
+; DWARF-NORL-NEXT: 0x0000000000000034 10 3 0 0 0 0 end_sequence
+; DWARF-ENRL-NEXT: 0x000000000000001c 2 0 0 0 0 0 is_stmt
+; DWARF-ENRL-NEXT: 0x000000000000002c 3 3 0 0 0 0 is_stmt prologue_end
+; DWARF-ENRL-NEXT: 0x000000000000003c 10 3 0 0 0 0 is_stmt
+; DWARF-ENRL-NEXT: 0x0000000000000048 10 3 0 0 0 0 epilogue_begin
+; DWARF-ENRL-NEXT: 0x0000000000000050 10 3 0 0 0 0 end_sequence
; ModuleID = 'dwarf-loongarch-relocs.c'
source_filename = "dwarf-loongarch-relocs.c"
--
2.20.1
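
The new `ENRL` expectations look like the `NORL` ones shifted by a constant 0x1c. A plausible reading, assuming the shift comes entirely from the Nops that accompany the `R_LARCH_ALIGN` relocation with addend 0x5, is that 2^5 - 4 = 28 bytes of padding now precede `foo`. The sketch below checks that arithmetic; `shiftByAlignPadding` is a hypothetical helper, not part of the test or of LLVM.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: shift each address by the worst-case Nop padding
// implied by the low byte of an R_LARCH_ALIGN addend (log2 of the alignment).
std::vector<uint64_t> shiftByAlignPadding(const std::vector<uint64_t> &Addrs,
                                          unsigned Log2Align) {
  const uint64_t MinNopLen = 4;
  uint64_t Pad = (uint64_t{1} << Log2Align) - MinNopLen;  // 28 for 0x5
  std::vector<uint64_t> Out;
  Out.reserve(Addrs.size());
  for (uint64_t A : Addrs)
    Out.push_back(A + Pad);
  return Out;
}
```

Applied to the `NORL` line-table addresses 0x0, 0x10, 0x20, 0x2c and 0x34, this yields the `ENRL` addresses 0x1c, 0x2c, 0x3c, 0x48 and 0x50; the relocation offsets 0x14 and 0x18 move to 0x30 and 0x34 in the same way.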

@ -0,0 +1,53 @@
From 442b5109ccbabed1110c122c1ca92d4194ba632b Mon Sep 17 00:00:00 2001
From: Fangrui Song <i@maskray.me>
Date: Wed, 9 Aug 2023 21:42:18 -0700
Subject: [PATCH 13/14] [MC][test] Change ELF/uleb-ehtable.s Mach-O to use
private symbols in .uleb128 for label differences
On Mach-O, `.uleb128 A-B` where A and B are separated by a non-private symbol is invalid
(see D153167).
(cherry picked from commit 0a89bda4a8b756a00985e0965f7686b5ceb43295)
Change-Id: I92ed11d6913b8c781e29be6e8c642cf0a371910d
---
llvm/test/MC/ELF/uleb-ehtable.s | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/llvm/test/MC/ELF/uleb-ehtable.s b/llvm/test/MC/ELF/uleb-ehtable.s
index ca3f9e97bffc..6407223f36e7 100644
--- a/llvm/test/MC/ELF/uleb-ehtable.s
+++ b/llvm/test/MC/ELF/uleb-ehtable.s
@@ -1,7 +1,7 @@
// RUN: llvm-mc -filetype=obj -triple i686-pc-linux-gnu %s -o - | llvm-readobj -S --sd - | FileCheck %s -check-prefix=CHECK -check-prefix=ELF
// RUN: llvm-mc -filetype=obj -triple x86_64-pc-linux-gnu %s -o - | llvm-readobj -S --sd - | FileCheck %s -check-prefix=CHECK -check-prefix=ELF
-// RUN: llvm-mc -filetype=obj -triple i386-apple-darwin9 %s -o - | llvm-readobj -S --sd - | FileCheck %s -check-prefix=CHECK -check-prefix=MACHO
-// RUN: llvm-mc -filetype=obj -triple x86_64-apple-darwin9 %s -o - | llvm-readobj -S --sd - | FileCheck %s -check-prefix=CHECK -check-prefix=MACHO
+// RUN: llvm-mc -filetype=obj -triple i386-apple-darwin9 --defsym MACHO=1 %s -o - | llvm-readobj -S --sd - | FileCheck %s -check-prefix=CHECK -check-prefix=MACHO
+// RUN: llvm-mc -filetype=obj -triple x86_64-apple-darwin9 --defsym MACHO=1 %s -o - | llvm-readobj -S --sd - | FileCheck %s -check-prefix=CHECK -check-prefix=MACHO
// Test that we can assemble a GCC-like EH table that has 16381-16383 bytes of
// non-padding data between .ttbaseref and .ttbase. The assembler must insert
@@ -13,11 +13,20 @@
foo:
.byte 0xff // LPStart omitted
.byte 0x1 // TType encoding (uleb128)
+.ifdef MACHO
+ .uleb128 Lttbase-Lttbaseref
+Lttbaseref:
+.else
.uleb128 .ttbase-.ttbaseref
.ttbaseref:
+.endif
.fill 128*128-1, 1, 0xcd // call site and actions tables
.balign 4
+.ifdef MACHO
+Lttbase:
+.else
.ttbase:
+.endif
.byte 1, 2, 3, 4
// ELF: Name: .data
--
2.20.1
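
The test sits right at a ULEB128 size boundary: values up to 16383 fit in two bytes, and 16384 needs three, so the length of the `.uleb128` field depends on the label difference it encodes. A minimal sketch of the standard ULEB128 encoding illustrates this; `encodeULEB128` here is a standalone illustration, not the LLVM function of the same name.

```cpp
#include <cstdint>
#include <vector>

// Standard unsigned LEB128: 7 payload bits per byte, least significant group
// first, with the high bit set on every byte except the last.
std::vector<uint8_t> encodeULEB128(uint64_t Value) {
  std::vector<uint8_t> Out;
  do {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80;  // more bytes follow
    Out.push_back(Byte);
  } while (Value != 0);
  return Out;
}
```

With this encoding, 16383 (0x3fff) becomes the two bytes `ff 7f`, while 16384 becomes the three bytes `80 80 01`.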

@ -0,0 +1,135 @@
From 3b777f98a3997f338919af7ff1ef8a6fd07f76a0 Mon Sep 17 00:00:00 2001
From: Fangrui Song <i@maskray.me>
Date: Wed, 16 Aug 2023 23:11:59 -0700
Subject: [PATCH 14/14] [Mips][MC] AttemptToFoldSymbolOffsetDifference: revert
isMicroMips special case
D52985/D57677 added a .gcc_except_table workaround, but the new behavior
doesn't match GNU assembler.
```
void foo();
int bar() {
foo();
try { throw 1; }
catch (int) { return 1; }
return 0;
}
clang --target=mipsel-linux-gnu -mmicromips -S a.cc
mipsel-linux-gnu-gcc -mmicromips -c a.s -o gnu.o
.uleb128 ($cst_end0)-($cst_begin0) // bit 0 is not forced to 1
.uleb128 ($func_begin0)-($func_begin0) // bit 0 is not forced to 1
```
I have inspected `.gcc_except_table` output by `mipsel-linux-gnu-gcc -mmicromips -c a.cc`.
The `.uleb128` values are not forced to set the least significant bit.
In addition, D57677's adjustment (even->odd) to CodeGen/Mips/micromips-b-range.ll is wrong.
PC-relative `.long func - .` values will differ from GNU assembler as well.
The original intention of D52985 seems unclear to me. I think whatever
goal it wants to achieve should be moved to an upper layer.
This isMicroMips special case has made it difficult to change MCAssembler::relaxLEB to use evaluateAsAbsolute instead of evaluateKnownAbsolute,
which is needed to properly support R_RISCV_SET_ULEB128/R_RISCV_SUB_ULEB128.
Differential Revision: https://reviews.llvm.org/D157655
(cherry picked from commit 4c89277095ee7cda3d20e0f5f18b384212069778)
Change-Id: Iedd73e0c61856c30fde442309fc16d4327829f1a
---
llvm/lib/MC/MCExpr.cpp | 5 -----
llvm/test/CodeGen/Mips/micromips-b-range.ll | 8 ++++----
llvm/test/CodeGen/Mips/micromips-gcc-except-table.ll | 2 +-
llvm/test/DebugInfo/Mips/eh_frame.ll | 4 ++--
4 files changed, 7 insertions(+), 12 deletions(-)
diff --git a/llvm/lib/MC/MCExpr.cpp b/llvm/lib/MC/MCExpr.cpp
index 79808a58d81c..c9ff1865cf91 100644
--- a/llvm/lib/MC/MCExpr.cpp
+++ b/llvm/lib/MC/MCExpr.cpp
@@ -611,11 +611,6 @@ static void AttemptToFoldSymbolOffsetDifference(
if (Asm->isThumbFunc(&SA))
Addend |= 1;
- // If symbol is labeled as micromips, we set low-bit to ensure
- // correct offset in .gcc_except_table
- if (Asm->getBackend().isMicroMips(&SA))
- Addend |= 1;
-
// Clear the symbol expr pointers to indicate we have folded these
// operands.
A = B = nullptr;
diff --git a/llvm/test/CodeGen/Mips/micromips-b-range.ll b/llvm/test/CodeGen/Mips/micromips-b-range.ll
index 064afff3da0e..81d1c04208cc 100644
--- a/llvm/test/CodeGen/Mips/micromips-b-range.ll
+++ b/llvm/test/CodeGen/Mips/micromips-b-range.ll
@@ -13,7 +13,7 @@
; CHECK-NEXT: 1e: fb fd 00 00 sw $ra, 0($sp)
; CHECK-NEXT: 22: 41 a1 00 01 lui $1, 1
; CHECK-NEXT: 26: 40 60 00 02 bal 0x2e <foo+0x2e>
-; CHECK-NEXT: 2a: 30 21 04 69 addiu $1, $1, 1129
+; CHECK-NEXT: 2a: 30 21 04 68 addiu $1, $1, 1128
; CHECK-NEXT: 2e: 00 3f 09 50 addu $1, $ra, $1
; CHECK-NEXT: 32: ff fd 00 00 lw $ra, 0($sp)
; CHECK-NEXT: 36: 00 01 0f 3c jr $1
@@ -27,7 +27,7 @@
; CHECK-NEXT: 56: fb fd 00 00 sw $ra, 0($sp)
; CHECK-NEXT: 5a: 41 a1 00 01 lui $1, 1
; CHECK-NEXT: 5e: 40 60 00 02 bal 0x66 <foo+0x66>
-; CHECK-NEXT: 62: 30 21 04 5d addiu $1, $1, 1117
+; CHECK-NEXT: 62: 30 21 04 5c addiu $1, $1, 1116
; CHECK-NEXT: 66: 00 3f 09 50 addu $1, $ra, $1
; CHECK-NEXT: 6a: ff fd 00 00 lw $ra, 0($sp)
; CHECK-NEXT: 6e: 00 01 0f 3c jr $1
@@ -39,7 +39,7 @@
; CHECK-NEXT: 86: fb fd 00 00 sw $ra, 0($sp)
; CHECK-NEXT: 8a: 41 a1 00 01 lui $1, 1
; CHECK-NEXT: 8e: 40 60 00 02 bal 0x96 <foo+0x96>
-; CHECK-NEXT: 92: 30 21 04 2d addiu $1, $1, 1069
+; CHECK-NEXT: 92: 30 21 04 2c addiu $1, $1, 1068
; CHECK-NEXT: 96: 00 3f 09 50 addu $1, $ra, $1
; CHECK-NEXT: 9a: ff fd 00 00 lw $ra, 0($sp)
; CHECK-NEXT: 9e: 00 01 0f 3c jr $1
@@ -51,7 +51,7 @@
; CHECK-NEXT: 10476: fb fd 00 00 sw $ra, 0($sp)
; CHECK-NEXT: 1047a: 41 a1 00 01 lui $1, 1
; CHECK-NEXT: 1047e: 40 60 00 02 bal 0x10486 <foo+0x10486>
-; CHECK-NEXT: 10482: 30 21 04 01 addiu $1, $1, 1025
+; CHECK-NEXT: 10482: 30 21 04 00 addiu $1, $1, 1024
; CHECK-NEXT: 10486: 00 3f 09 50 addu $1, $ra, $1
; CHECK-NEXT: 1048a: ff fd 00 00 lw $ra, 0($sp)
; CHECK-NEXT: 1048e: 00 01 0f 3c jr $1
diff --git a/llvm/test/CodeGen/Mips/micromips-gcc-except-table.ll b/llvm/test/CodeGen/Mips/micromips-gcc-except-table.ll
index 2b63aff01574..20d64fc216b7 100644
--- a/llvm/test/CodeGen/Mips/micromips-gcc-except-table.ll
+++ b/llvm/test/CodeGen/Mips/micromips-gcc-except-table.ll
@@ -1,7 +1,7 @@
; RUN: llc -mtriple=mips-linux-gnu -mcpu=mips32r2 -mattr=+micromips -O3 -filetype=obj < %s | llvm-objdump -s -j .gcc_except_table - | FileCheck %s
; CHECK: Contents of section .gcc_except_table:
-; CHECK-NEXT: 0000 ff9b1501 0c011100 00110e1f 011f1800
+; CHECK-NEXT: 0000 ff9b1501 0c001000 00100e1e 011e1800
; CHECK-NEXT: 0010 00010000 00000000
@_ZTIi = external constant ptr
diff --git a/llvm/test/DebugInfo/Mips/eh_frame.ll b/llvm/test/DebugInfo/Mips/eh_frame.ll
index 506e5b87892b..60d4dc76777e 100644
--- a/llvm/test/DebugInfo/Mips/eh_frame.ll
+++ b/llvm/test/DebugInfo/Mips/eh_frame.ll
@@ -26,9 +26,9 @@
; CHECK-READELF-PIC-NEXT: R_MIPS_PC32
; CHECK-READELF-NEXT: .gcc_except_table
-; EXCEPT-TABLE-STATIC: 0000 ff9b1501 0c011500 00150e23 01231e00 ...........#.#..
+; EXCEPT-TABLE-STATIC: 0000 ff9b1501 0c001400 00140e22 01221e00 ..........."."..
; EXCEPT-TABLE-STATIC: 0010 00010000 00000000
-; EXCEPT-TABLE-PIC: 0000 ff9b1501 0c012d00 002d133f 013f2a00 ......-..-.?.?*.
+; EXCEPT-TABLE-PIC: 0000 ff9b1501 0c002c00 002c123e 013e2a00 ......,..,.>.>*.
; EXCEPT-TABLE-PIC: 0010 00010000 00000000 ........
@_ZTIi = external constant ptr
--
2.20.1
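
The updated expectations all differ from the old ones by the least significant bit (for example `addiu $1, $1, 1129` becomes `1128`). The sketch below is a simplified model of the folding step, not the code in `MCExpr.cpp`: `foldSymbolDifference` and its parameters are hypothetical, and the `ForceMicroMipsBit` flag exists only to contrast the behavior before and after this patch.

```cpp
#include <cstdint>

// Simplified model of folding "A - B" for two symbols in the same fragment.
// OffsetA and OffsetB are the symbols' offsets; the flags are illustrative.
int64_t foldSymbolDifference(int64_t OffsetA, int64_t OffsetB,
                             bool IsThumbFunc, bool ForceMicroMipsBit) {
  int64_t Addend = OffsetA - OffsetB;
  if (IsThumbFunc)
    Addend |= 1;  // kept: Thumb function addresses carry bit 0
  if (ForceMicroMipsBit)
    Addend |= 1;  // the special case this patch removes
  return Addend;
}
```

In this model an even difference such as 1128 stays 1128 once the microMIPS special case is gone, which is consistent with the even values in the new `CHECK` lines.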

File diff suppressed because it is too large

File diff suppressed because it is too large

@ -0,0 +1,37 @@
From 58f4b115f97435efa1df1baa3247791a13444c00 Mon Sep 17 00:00:00 2001
From: Nikita Popov <npopov@redhat.com>
Date: Tue, 26 Sep 2023 16:51:40 +0200
Subject: [PATCH 1/3] [Bitcode] Add some missing GetTypeByID failure checks
Print an error instead of crashing.
Fixes https://github.com/llvm/llvm-project/issues/67388.
---
llvm/lib/Bitcode/Reader/MetadataLoader.cpp | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/llvm/lib/Bitcode/Reader/MetadataLoader.cpp b/llvm/lib/Bitcode/Reader/MetadataLoader.cpp
index 0a9a80688a41..2f024899b451 100644
--- a/llvm/lib/Bitcode/Reader/MetadataLoader.cpp
+++ b/llvm/lib/Bitcode/Reader/MetadataLoader.cpp
@@ -1315,7 +1315,7 @@ Error MetadataLoader::MetadataLoaderImpl::parseOneMetadata(
unsigned TyID = Record[0];
Type *Ty = Callbacks.GetTypeByID(TyID);
- if (Ty->isMetadataTy() || Ty->isVoidTy()) {
+ if (!Ty || Ty->isMetadataTy() || Ty->isVoidTy()) {
dropRecord();
break;
}
@@ -1366,7 +1366,7 @@ Error MetadataLoader::MetadataLoaderImpl::parseOneMetadata(
unsigned TyID = Record[0];
Type *Ty = Callbacks.GetTypeByID(TyID);
- if (Ty->isMetadataTy() || Ty->isVoidTy())
+ if (!Ty || Ty->isMetadataTy() || Ty->isVoidTy())
return error("Invalid record");
Value *V = ValueList.getValueFwdRef(Record[1], Ty, TyID,
--
2.33.0

@ -0,0 +1,74 @@
From 678cf3a36644847cac4b0be2d919aba77416088a Mon Sep 17 00:00:00 2001
From: Nikita Popov <npopov@redhat.com>
Date: Mon, 04 Mar 2024 07:00:37 +0800
Subject: [PATCH] [Backport][X86][Inline] Skip inline asm in inlining target
feature check
When inlining across functions with different target features, we
perform roughly two checks:
1. The caller features must be a superset of the callee features.
2. Calls in the callee cannot use types where the target features would
change the call ABI (e.g. by changing whether something is passed in a
zmm or two ymm registers). The latter check is very crude right now.
The latter check currently also catches inline asm "calls". I believe
that inline asm should be excluded from this check, as it is independent
from the usual call ABI, and instead governed by the inline asm
constraint string.
---
.../lib/Target/X86/X86TargetTransformInfo.cpp | 4 +++
.../Inline/X86/call-abi-compatibility.ll | 26 +++++++++++++++++++
2 files changed, 30 insertions(+)
diff --git a/llvm/lib/Target/X86/X86TargetTransformInfo.cpp b/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
index 129a2646d..9c7954230 100644
--- a/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
+++ b/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
@@ -6046,6 +6046,10 @@ bool X86TTIImpl::areInlineCompatible(const Function *Caller,
for (const Instruction &I : instructions(Callee)) {
if (const auto *CB = dyn_cast<CallBase>(&I)) {
+ // Having more target features is fine for inline ASM.
+ if (CB->isInlineAsm())
+ continue;
+
SmallVector<Type *, 8> Types;
for (Value *Arg : CB->args())
Types.push_back(Arg->getType());
diff --git a/llvm/test/Transforms/Inline/X86/call-abi-compatibility.ll b/llvm/test/Transforms/Inline/X86/call-abi-compatibility.ll
index 3a30980fe..6f582cab2 100644
--- a/llvm/test/Transforms/Inline/X86/call-abi-compatibility.ll
+++ b/llvm/test/Transforms/Inline/X86/call-abi-compatibility.ll
@@ -93,3 +93,29 @@ define internal void @caller_not_avx4() {
}
declare i64 @caller_unknown_simple(i64)
+
+; This call should get inlined, because the callee only contains
+; inline ASM, not real calls.
+define <8 x i64> @caller_inline_asm(ptr %p0, i64 %k, ptr %p1, ptr %p2) #0 {
+; CHECK-LABEL: define {{[^@]+}}@caller_inline_asm
+; CHECK-SAME: (ptr [[P0:%.*]], i64 [[K:%.*]], ptr [[P1:%.*]], ptr [[P2:%.*]]) #[[ATTR2:[0-9]+]] {
+; CHECK-NEXT: [[SRC_I:%.*]] = load <8 x i64>, ptr [[P0]], align 64
+; CHECK-NEXT: [[A_I:%.*]] = load <8 x i64>, ptr [[P1]], align 64
+; CHECK-NEXT: [[B_I:%.*]] = load <8 x i64>, ptr [[P2]], align 64
+; CHECK-NEXT: [[TMP1:%.*]] = call <8 x i64> asm "vpaddb\09$($3, $2, $0 {$1}", "=v,^Yk,v,v,0,~{dirflag},~{fpsr},~{flags}"(i64 [[K]], <8 x i64> [[A_I]], <8 x i64> [[B_I]], <8 x i64> [[SRC_I]])
+; CHECK-NEXT: ret <8 x i64> [[TMP1]]
+;
+ %call = call <8 x i64> @callee_inline_asm(ptr %p0, i64 %k, ptr %p1, ptr %p2)
+ ret <8 x i64> %call
+}
+
+define internal <8 x i64> @callee_inline_asm(ptr %p0, i64 %k, ptr %p1, ptr %p2) #1 {
+ %src = load <8 x i64>, ptr %p0, align 64
+ %a = load <8 x i64>, ptr %p1, align 64
+ %b = load <8 x i64>, ptr %p2, align 64
+ %1 = tail call <8 x i64> asm "vpaddb\09$($3, $2, $0 {$1}", "=v,^Yk,v,v,0,~{dirflag},~{fpsr},~{flags}"(i64 %k, <8 x i64> %a, <8 x i64> %b, <8 x i64> %src) #2
+ ret <8 x i64> %1
+}
+
+attributes #0 = { "min-legal-vector-width"="512" "target-features"="+avx,+avx2,+avx512bw,+avx512dq,+avx512f,+cmov,+crc32,+cx8,+evex512,+f16c,+fma,+fxsr,+mmx,+popcnt,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave" "tune-cpu"="generic" }
+attributes #1 = { "min-legal-vector-width"="512" "target-features"="+avx,+avx2,+avx512bw,+avx512f,+cmov,+crc32,+cx8,+evex512,+f16c,+fma,+fxsr,+mmx,+popcnt,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave" "tune-cpu"="generic" }
--
2.33.0
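
The two checks described in the commit message, plus the new inline-asm exemption, can be modelled in a few lines. This is a sketch under simplifying assumptions, not `X86TTIImpl::areInlineCompatible`: feature sets are plain string sets, and `CallSite` with its two flags is a hypothetical stand-in for inspecting a `CallBase` and its argument types.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Hypothetical summary of one call instruction inside the callee.
struct CallSite {
  bool IsInlineAsm;            // an inline asm "call", not a real call
  bool UsesABISensitiveTypes;  // e.g. vectors whose passing depends on features
};

bool areInlineCompatible(const std::set<std::string> &CallerFeatures,
                         const std::set<std::string> &CalleeFeatures,
                         const std::vector<CallSite> &CalleeCalls) {
  // Check 1: the caller's features must be a superset of the callee's.
  if (!std::includes(CallerFeatures.begin(), CallerFeatures.end(),
                     CalleeFeatures.begin(), CalleeFeatures.end()))
    return false;
  // Identical feature sets cannot change any call ABI.
  if (CallerFeatures == CalleeFeatures)
    return true;
  // Check 2: real calls must not use types whose ABI depends on features.
  for (const CallSite &CS : CalleeCalls) {
    if (CS.IsInlineAsm)
      continue;  // governed by the asm constraint string, not the call ABI
    if (CS.UsesABISensitiveTypes)
      return false;
  }
  return true;
}
```

In the added test, the caller has `+avx512dq` on top of the callee's features and the callee's only "call" is inline asm on `<8 x i64>` values, so in this model the second check no longer blocks inlining.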

@ -0,0 +1,87 @@
From 4aec2da60ce3f639e31d81406c09d5c88b3b8f53 Mon Sep 17 00:00:00 2001
From: Florian Hahn <flo@fhahn.com>
Date: Wed, 20 Dec 2023 16:56:15 +0100
Subject: [PATCH 2/3] [ARM] Check all terms in emitPopInst when clearing
Restored for LR. (#75527)
emitPopInst checks a single function exit MBB. If other paths also exit
the function and any of their terminators uses LR implicitly, it is not
safe to clear the Restored bit.
Check all terminators for the function before clearing Restored.
This fixes a mis-compile in outlined-fn-may-clobber-lr-in-caller.ll
where the machine-outliner previously introduced BLs that clobbered LR
which in turn is used by the tail call return.
Alternative to #73553
---
llvm/lib/Target/ARM/ARMFrameLowering.cpp | 30 +++++++++++++++++++++---
llvm/lib/Target/ARM/ARMFrameLowering.h | 3 +++
2 files changed, 30 insertions(+), 3 deletions(-)
diff --git a/llvm/lib/Target/ARM/ARMFrameLowering.cpp b/llvm/lib/Target/ARM/ARMFrameLowering.cpp
index 4496d4928ebe..650f4650eef0 100644
--- a/llvm/lib/Target/ARM/ARMFrameLowering.cpp
+++ b/llvm/lib/Target/ARM/ARMFrameLowering.cpp
@@ -1645,9 +1645,6 @@ void ARMFrameLowering::emitPopInst(MachineBasicBlock &MBB,
// Fold the return instruction into the LDM.
DeleteRet = true;
LdmOpc = AFI->isThumbFunction() ? ARM::t2LDMIA_RET : ARM::LDMIA_RET;
- // We 'restore' LR into PC so it is not live out of the return block:
- // Clear Restored bit.
- Info.setRestored(false);
}
// If NoGap is true, pop consecutive registers and then leave the rest
@@ -2769,6 +2766,33 @@ void ARMFrameLowering::determineCalleeSaves(MachineFunction &MF,
AFI->setLRIsSpilled(SavedRegs.test(ARM::LR));
}
+void ARMFrameLowering::processFunctionBeforeFrameFinalized(
+ MachineFunction &MF, RegScavenger *RS) const {
+ TargetFrameLowering::processFunctionBeforeFrameFinalized(MF, RS);
+
+ MachineFrameInfo &MFI = MF.getFrameInfo();
+ if (!MFI.isCalleeSavedInfoValid())
+ return;
+
+ // Check if all terminators do not implicitly use LR. Then we can 'restore' LR
+ // into PC so it is not live out of the return block: Clear the Restored bit
+ // in that case.
+ for (CalleeSavedInfo &Info : MFI.getCalleeSavedInfo()) {
+ if (Info.getReg() != ARM::LR)
+ continue;
+ if (all_of(MF, [](const MachineBasicBlock &MBB) {
+ return all_of(MBB.terminators(), [](const MachineInstr &Term) {
+ return !Term.isReturn() || Term.getOpcode() == ARM::LDMIA_RET ||
+ Term.getOpcode() == ARM::t2LDMIA_RET ||
+ Term.getOpcode() == ARM::tPOP_RET;
+ });
+ })) {
+ Info.setRestored(false);
+ break;
+ }
+ }
+}
+
void ARMFrameLowering::getCalleeSaves(const MachineFunction &MF,
BitVector &SavedRegs) const {
TargetFrameLowering::getCalleeSaves(MF, SavedRegs);
diff --git a/llvm/lib/Target/ARM/ARMFrameLowering.h b/llvm/lib/Target/ARM/ARMFrameLowering.h
index 16f2ce6bea6f..8d2b8beb9a58 100644
--- a/llvm/lib/Target/ARM/ARMFrameLowering.h
+++ b/llvm/lib/Target/ARM/ARMFrameLowering.h
@@ -59,6 +59,9 @@ public:
void determineCalleeSaves(MachineFunction &MF, BitVector &SavedRegs,
RegScavenger *RS) const override;
+ void processFunctionBeforeFrameFinalized(
+ MachineFunction &MF, RegScavenger *RS = nullptr) const override;
+
void adjustForSegmentedStacks(MachineFunction &MF,
MachineBasicBlock &MBB) const override;
--
2.33.0
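
The core of the new `processFunctionBeforeFrameFinalized` is a whole-function predicate: the Restored bit for LR may be cleared only if every return in the function pops directly into PC. The sketch below models just that predicate with plain structs; `Terminator`, `Block`, and `canClearLRRestored` are hypothetical names, and `PopsIntoPC` stands in for the opcode comparison against `LDMIA_RET`, `t2LDMIA_RET`, and `tPOP_RET`.

```cpp
#include <algorithm>
#include <vector>

// Simplified stand-in for a machine terminator; not an LLVM type.
struct Terminator {
  bool IsReturn;    // any instruction that returns from the function
  bool PopsIntoPC;  // an LDM/POP return that loads the saved LR into PC
};

using Block = std::vector<Terminator>;

// True only if no return in any block still relies on LR being restored.
bool canClearLRRestored(const std::vector<Block> &Function) {
  return std::all_of(Function.begin(), Function.end(), [](const Block &MBB) {
    return std::all_of(MBB.begin(), MBB.end(), [](const Terminator &T) {
      return !T.IsReturn || T.PopsIntoPC;
    });
  });
}
```

A function with one pop-into-PC return and one tail-call style return fails the predicate, which corresponds to the mis-compile scenario the commit message describes; non-return terminators such as branches never affect the result.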

@ -0,0 +1,116 @@
From 369bfc8ea8c0a9da51b4bd964f0045cb389c3c2f Mon Sep 17 00:00:00 2001
From: ostannard <oliver.stannard@arm.com>
Date: Mon, 26 Feb 2024 12:23:25 +0000
Subject: [PATCH 3/3] [ARM] Update IsRestored for LR based on all returns
(#82745)
PR #75527 fixed ARMFrameLowering to set the IsRestored flag for LR based
on all of the return instructions in the function, not just one.
However, there is also code in ARMLoadStoreOptimizer which changes
return instructions, but it set IsRestored based on the one instruction
it changed, not the whole function.
The fix is to factor out the code added in #75527, and also call it from
ARMLoadStoreOptimizer if it made a change to return instructions.
Fixes #80287.
---
llvm/lib/Target/ARM/ARMFrameLowering.cpp | 11 +++++----
llvm/lib/Target/ARM/ARMFrameLowering.h | 4 ++++
llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp | 23 ++++++++-----------
3 files changed, 21 insertions(+), 17 deletions(-)
diff --git a/llvm/lib/Target/ARM/ARMFrameLowering.cpp b/llvm/lib/Target/ARM/ARMFrameLowering.cpp
index 650f4650eef0..008ba4e5924b 100644
--- a/llvm/lib/Target/ARM/ARMFrameLowering.cpp
+++ b/llvm/lib/Target/ARM/ARMFrameLowering.cpp
@@ -2766,10 +2766,7 @@ void ARMFrameLowering::determineCalleeSaves(MachineFunction &MF,
AFI->setLRIsSpilled(SavedRegs.test(ARM::LR));
}
-void ARMFrameLowering::processFunctionBeforeFrameFinalized(
- MachineFunction &MF, RegScavenger *RS) const {
- TargetFrameLowering::processFunctionBeforeFrameFinalized(MF, RS);
-
+void ARMFrameLowering::updateLRRestored(MachineFunction &MF) {
MachineFrameInfo &MFI = MF.getFrameInfo();
if (!MFI.isCalleeSavedInfoValid())
return;
@@ -2793,6 +2790,12 @@ void ARMFrameLowering::processFunctionBeforeFrameFinalized(
}
}
+void ARMFrameLowering::processFunctionBeforeFrameFinalized(
+ MachineFunction &MF, RegScavenger *RS) const {
+ TargetFrameLowering::processFunctionBeforeFrameFinalized(MF, RS);
+ updateLRRestored(MF);
+}
+
void ARMFrameLowering::getCalleeSaves(const MachineFunction &MF,
BitVector &SavedRegs) const {
TargetFrameLowering::getCalleeSaves(MF, SavedRegs);
diff --git a/llvm/lib/Target/ARM/ARMFrameLowering.h b/llvm/lib/Target/ARM/ARMFrameLowering.h
index 8d2b8beb9a58..3c7358d8cd53 100644
--- a/llvm/lib/Target/ARM/ARMFrameLowering.h
+++ b/llvm/lib/Target/ARM/ARMFrameLowering.h
@@ -59,6 +59,10 @@ public:
void determineCalleeSaves(MachineFunction &MF, BitVector &SavedRegs,
RegScavenger *RS) const override;
+ /// Update the IsRestored flag on LR if it is spilled, based on the return
+ /// instructions.
+ static void updateLRRestored(MachineFunction &MF);
+
void processFunctionBeforeFrameFinalized(
MachineFunction &MF, RegScavenger *RS = nullptr) const override;
diff --git a/llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp b/llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp
index 93db983b92c0..37d9e1addd1e 100644
--- a/llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp
+++ b/llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp
@@ -2062,17 +2062,6 @@ bool ARMLoadStoreOpt::MergeReturnIntoLDM(MachineBasicBlock &MBB) {
MO.setReg(ARM::PC);
PrevMI.copyImplicitOps(*MBB.getParent(), *MBBI);
MBB.erase(MBBI);
- // We now restore LR into PC so it is not live-out of the return block
- // anymore: Clear the CSI Restored bit.
- MachineFrameInfo &MFI = MBB.getParent()->getFrameInfo();
- // CSI should be fixed after PrologEpilog Insertion
- assert(MFI.isCalleeSavedInfoValid() && "CSI should be valid");
- for (CalleeSavedInfo &Info : MFI.getCalleeSavedInfo()) {
- if (Info.getReg() == ARM::LR) {
- Info.setRestored(false);
- break;
- }
- }
return true;
}
}
@@ -2120,14 +2109,22 @@ bool ARMLoadStoreOpt::runOnMachineFunction(MachineFunction &Fn) {
isThumb2 = AFI->isThumb2Function();
isThumb1 = AFI->isThumbFunction() && !isThumb2;
- bool Modified = false;
+ bool Modified = false, ModifiedLDMReturn = false;
for (MachineBasicBlock &MBB : Fn) {
Modified |= LoadStoreMultipleOpti(MBB);
if (STI->hasV5TOps() && !AFI->shouldSignReturnAddress())
- Modified |= MergeReturnIntoLDM(MBB);
+ ModifiedLDMReturn |= MergeReturnIntoLDM(MBB);
if (isThumb1)
Modified |= CombineMovBx(MBB);
}
+ Modified |= ModifiedLDMReturn;
+
+ // If we merged a BX instruction into an LDM, we need to re-calculate whether
+ // LR is restored. This check needs to consider the whole function, not just
+ // the instruction(s) we changed, because there may be other BX returns which
+ // still need LR to be restored.
+ if (ModifiedLDMReturn)
+ ARMFrameLowering::updateLRRestored(Fn);
Allocator.DestroyAll();
return Modified;
--
2.33.0
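The reasoning in the ARMLoadStoreOptimizer comment above (recompute the LR "restored" flag over the whole function, not per merged instruction) can be sketched as a toy model. This is only an illustration: the body of `updateLRRestored` lies outside the hunk shown, and the names `BX_LR`, `LDM_RET`, `merge_return_into_ldm`, and `lr_is_restored` are hypothetical, not LLVM APIs.

```python
# Toy model: every return in a function is tagged by how it obtains the
# return address. These labels are hypothetical, not LLVM opcodes.
BX_LR = "bx_lr"      # branches through LR, so LR must be reloaded first
LDM_RET = "ldm_ret"  # pops the saved return address straight into PC


def merge_return_into_ldm(returns, index):
    """Model merging one BX return into an LDM that writes PC."""
    merged = list(returns)
    merged[index] = LDM_RET
    return merged


def lr_is_restored(returns):
    """LR still has to be restored if any return branches through it."""
    return any(kind == BX_LR for kind in returns)


two_returns = [BX_LR, BX_LR]
after_one_merge = merge_return_into_ldm(two_returns, 0)
after_both_merges = merge_return_into_ldm(after_one_merge, 1)

# Clearing the flag as soon as one return was merged (the removed code path)
# would be wrong here: the second return still needs LR.
print(lr_is_restored(after_one_merge))    # a BX return remains
print(lr_is_restored(after_both_merges))  # every return now pops into PC
```

In this model the flag is derived from all returns at once, which is the property the patch comment asks for.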

File diff suppressed because it is too large Load Diff

@ -0,0 +1,30 @@
From 5721be433ddee5f60d4a9434df43a023f1ec4c0e Mon Sep 17 00:00:00 2001
From: wangqiang <wangqiang1@kylinos.cn>
Date: Sun, 28 Apr 2024 14:30:34 +0800
Subject: [PATCH] Update llvm-lit config to support build_for_openeuler
---
llvm/cmake/modules/HandleLLVMOptions.cmake | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/llvm/cmake/modules/HandleLLVMOptions.cmake b/llvm/cmake/modules/HandleLLVMOptions.cmake
index 76723be69..c6f5569af 100644
--- a/llvm/cmake/modules/HandleLLVMOptions.cmake
+++ b/llvm/cmake/modules/HandleLLVMOptions.cmake
@@ -97,6 +97,13 @@ if( LLVM_ENABLE_ASSERTIONS )
set(LLVM_ENABLE_CLASSIC_FLANG 0)
endif()
+option(BUILD_FOR_OPENEULER "Build support for openeuler" OFF)
+if(BUILD_FOR_OPENEULER)
+ set(BUILD_FOR_OPENEULER 1)
+else()
+ set(BUILD_FOR_OPENEULER 0)
+endif()
+
if(LLVM_ENABLE_EXPENSIVE_CHECKS)
add_compile_definitions(EXPENSIVE_CHECKS)
--
2.33.0

File diff suppressed because it is too large Load Diff

@ -0,0 +1,28 @@
From 4673c2eaba443678c4dc6ae74ea16a489b415fed Mon Sep 17 00:00:00 2001
From: liyunfei <liyunfei33@huawei.com>
Date: Tue, 19 Sep 2023 09:31:43 +0800
Subject: [PATCH] Prevent environment variables from exceeding NAME_MAX
---
llvm/lib/Support/Unix/Path.inc | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/llvm/lib/Support/Unix/Path.inc b/llvm/lib/Support/Unix/Path.inc
index 2ae7c6dc..f13f3165 100644
--- a/llvm/lib/Support/Unix/Path.inc
+++ b/llvm/lib/Support/Unix/Path.inc
@@ -1427,8 +1427,12 @@ static const char *getEnvTempDir() {
// variable.
const char *EnvironmentVariables[] = {"TMPDIR", "TMP", "TEMP", "TEMPDIR"};
for (const char *Env : EnvironmentVariables) {
- if (const char *Dir = std::getenv(Env))
+ if (const char *Dir = std::getenv(Env)) {
+ if(std::strlen(Dir) > NAME_MAX) {
+ continue;
+ }
return Dir;
+ }
}
return nullptr;
--
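The lookup order and length check added to `getEnvTempDir` above can be sketched as follows. This is a minimal model, assuming `NAME_MAX` is 255 (a common Linux value; the C++ code uses the platform macro) and using a hypothetical `get_env_temp_dir` helper that takes the environment as a dictionary.

```python
NAME_MAX = 255  # assumption: typical Linux value, not read from the platform


def get_env_temp_dir(environ):
    """Return the first temp-dir variable whose value fits in NAME_MAX bytes."""
    for name in ("TMPDIR", "TMP", "TEMP", "TEMPDIR"):
        value = environ.get(name)
        if value is None:
            continue
        # std::strlen counts bytes, so compare the encoded length.
        if len(value.encode()) > NAME_MAX:
            continue  # too long: fall through to the next variable
        return value
    return None


env = {"TMPDIR": "/" + "a" * 300, "TMP": "/var/tmp"}
print(get_env_temp_dir(env))
```

An over-long value is skipped instead of aborting the search, so a later variable in the list can still supply the directory.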

@ -0,0 +1,517 @@
From cac43828d26b178807d194b4bd7c5df69603df29 Mon Sep 17 00:00:00 2001
From: xiajingze <xiajingze1@huawei.com>
Date: Wed, 31 Jul 2024 18:37:29 +0800
Subject: [PATCH] [AArch64] Support HiSilicon's HIP09 Processor
Signed-off-by: xiajingze <xiajingze1@huawei.com>
---
llvm/cmake/modules/HandleLLVMOptions.cmake | 8 ++
.../llvm/TargetParser/AArch64TargetParser.h | 7 ++
llvm/lib/Target/AArch64/AArch64.td | 36 +++++++
.../lib/Target/AArch64/AArch64MacroFusion.cpp | 55 +++++++++++
llvm/lib/Target/AArch64/AArch64Subtarget.cpp | 9 ++
llvm/lib/Target/AArch64/AArch64Subtarget.h | 9 +-
llvm/lib/Target/CMakeLists.txt | 4 +
llvm/lib/TargetParser/Host.cpp | 3 +
llvm/test/CodeGen/AArch64/cpus-hip09.ll | 11 +++
.../CodeGen/AArch64/macro-fusion-mvnclz.mir | 20 ++++
.../AArch64/misched-fusion-lit-hip09.ll | 73 ++++++++++++++
llvm/test/CodeGen/AArch64/remat-hip09.ll | 18 ++++
llvm/test/lit.site.cfg.py.in | 4 +
llvm/unittests/TargetParser/Host.cpp | 5 +
.../TargetParser/TargetParserTest.cpp | 16 +++
15 files changed, 277 insertions(+), 1 deletion(-)
create mode 100644 llvm/test/CodeGen/AArch64/cpus-hip09.ll
create mode 100644 llvm/test/CodeGen/AArch64/macro-fusion-mvnclz.mir
create mode 100644 llvm/test/CodeGen/AArch64/misched-fusion-lit-hip09.ll
create mode 100644 llvm/test/CodeGen/AArch64/remat-hip09.ll
diff --git a/llvm/cmake/modules/HandleLLVMOptions.cmake b/llvm/cmake/modules/HandleLLVMOptions.cmake
index 8be5d4ba5..74e68e25d 100644
--- a/llvm/cmake/modules/HandleLLVMOptions.cmake
+++ b/llvm/cmake/modules/HandleLLVMOptions.cmake
@@ -112,6 +112,14 @@ else()
set(LLVM_ENABLE_AUTOTUNER 0)
endif()
+option(LLVM_ENABLE_AARCH64_HIP09 "Enable HIP09 Processor" ON)
+if(LLVM_ENABLE_AARCH64_HIP09)
+ set(LLVM_ENABLE_AARCH64_HIP09 1)
+ add_definitions( -DENABLE_AARCH64_HIP09 )
+else()
+ set(LLVM_ENABLE_AARCH64_HIP09 0)
+endif()
+
if(LLVM_ENABLE_EXPENSIVE_CHECKS)
add_compile_definitions(EXPENSIVE_CHECKS)
diff --git a/llvm/include/llvm/TargetParser/AArch64TargetParser.h b/llvm/include/llvm/TargetParser/AArch64TargetParser.h
index dc4cdfa8e..07cd2fcbb 100644
--- a/llvm/include/llvm/TargetParser/AArch64TargetParser.h
+++ b/llvm/include/llvm/TargetParser/AArch64TargetParser.h
@@ -542,6 +542,13 @@ inline constexpr CpuInfo CpuInfos[] = {
(AArch64::AEK_FP16 | AArch64::AEK_RAND | AArch64::AEK_SM4 |
AArch64::AEK_SHA3 | AArch64::AEK_SHA2 | AArch64::AEK_AES |
AArch64::AEK_MTE | AArch64::AEK_SB | AArch64::AEK_SSBS)},
+#if defined(ENABLE_AARCH64_HIP09)
+ {"hip09", ARMV8_5A,
+ (AArch64::AEK_AES | AArch64::AEK_SM4 | AArch64::AEK_SHA2 |
+ AArch64::AEK_SHA3 | AArch64::AEK_FP16 | AArch64::AEK_PROFILE |
+ AArch64::AEK_FP16FML | AArch64::AEK_SVE | AArch64::AEK_I8MM |
+ AArch64::AEK_F32MM | AArch64::AEK_F64MM | AArch64::AEK_BF16)},
+#endif
};
// An alias for a CPU.
diff --git a/llvm/lib/Target/AArch64/AArch64.td b/llvm/lib/Target/AArch64/AArch64.td
index 8f50af4b7..c8bfd770f 100644
--- a/llvm/lib/Target/AArch64/AArch64.td
+++ b/llvm/lib/Target/AArch64/AArch64.td
@@ -296,6 +296,12 @@ def FeatureFuseAddSub2RegAndConstOne : SubtargetFeature<
"fuse-addsub-2reg-const1", "HasFuseAddSub2RegAndConstOne", "true",
"CPU fuses (a + b + 1) and (a - b - 1)">;
+#ifdef ENABLE_AARCH64_HIP09
+def FeatureFuseMvnClz : SubtargetFeature<
+ "fuse-mvn-clz", "HasFuseMvnClz", "true",
+ "CPU fuses mvn+clz operations">;
+#endif
+
def FeatureDisableLatencySchedHeuristic : SubtargetFeature<
"disable-latency-sched-heuristic", "DisableLatencySchedHeuristic", "true",
"Disable latency scheduling heuristic">;
@@ -1205,6 +1211,21 @@ def TuneTSV110 : SubtargetFeature<"tsv110", "ARMProcFamily", "TSV110",
FeatureFuseAES,
FeaturePostRAScheduler]>;
+#ifdef ENABLE_AARCH64_HIP09
+def TuneHIP09 : SubtargetFeature<"hip09", "ARMProcFamily", "HIP09",
+ "HiSilicon HIP-09 processors", [
+ FeatureCustomCheapAsMoveHandling,
+ FeatureExperimentalZeroingPseudos,
+ FeatureFuseAES,
+ FeatureLSLFast,
+ FeatureAscendStoreAddress,
+ FeatureCmpBccFusion,
+ FeatureArithmeticBccFusion,
+ FeatureFuseLiterals,
+ FeatureFuseMvnClz,
+ FeaturePostRAScheduler]>;
+#endif
+
def TuneAmpere1 : SubtargetFeature<"ampere1", "ARMProcFamily", "Ampere1",
"Ampere Computing Ampere-1 processors", [
FeaturePostRAScheduler,
@@ -1359,6 +1380,14 @@ def ProcessorFeatures {
list<SubtargetFeature> TSV110 = [HasV8_2aOps, FeatureCrypto, FeatureFPARMv8,
FeatureNEON, FeaturePerfMon, FeatureSPE,
FeatureFullFP16, FeatureFP16FML, FeatureDotProd];
+#ifdef ENABLE_AARCH64_HIP09
+ list<SubtargetFeature> HIP09 = [HasV8_5aOps, FeatureBF16, FeatureCrypto, FeatureFPARMv8,
+ FeatureMatMulInt8, FeatureMatMulFP32, FeatureMatMulFP64,
+ FeatureNEON, FeaturePerfMon, FeatureRandGen, FeatureSPE,
+ FeatureFullFP16, FeatureFP16FML, FeatureDotProd,
+ FeatureJS, FeatureComplxNum, FeatureSHA3, FeatureSM4,
+ FeatureSVE];
+#endif
list<SubtargetFeature> Ampere1 = [HasV8_6aOps, FeatureNEON, FeaturePerfMon,
FeatureSSBS, FeatureRandGen, FeatureSB,
FeatureSHA2, FeatureSHA3, FeatureAES];
@@ -1464,8 +1493,15 @@ def : ProcessorModel<"thunderx2t99", ThunderX2T99Model,
// Marvell ThunderX3T110 Processors.
def : ProcessorModel<"thunderx3t110", ThunderX3T110Model,
ProcessorFeatures.ThunderX3T110, [TuneThunderX3T110]>;
+
+// HiSilicon Processors.
def : ProcessorModel<"tsv110", TSV110Model, ProcessorFeatures.TSV110,
[TuneTSV110]>;
+#ifdef ENABLE_AARCH64_HIP09
+// FIXME: HiSilicon HIP09 is currently modeled as a Cortex-A57.
+def : ProcessorModel<"hip09", CortexA57Model, ProcessorFeatures.HIP09,
+ [TuneHIP09]>;
+#endif
// Support cyclone as an alias for apple-a7 so we can still LTO old bitcode.
def : ProcessorModel<"cyclone", CycloneModel, ProcessorFeatures.AppleA7,
diff --git a/llvm/lib/Target/AArch64/AArch64MacroFusion.cpp b/llvm/lib/Target/AArch64/AArch64MacroFusion.cpp
index 05d60872b..4963ec350 100644
--- a/llvm/lib/Target/AArch64/AArch64MacroFusion.cpp
+++ b/llvm/lib/Target/AArch64/AArch64MacroFusion.cpp
@@ -51,6 +51,12 @@ static bool isArithmeticBccPair(const MachineInstr *FirstMI,
case AArch64::SUBSXrr:
case AArch64::BICSWrr:
case AArch64::BICSXrr:
+#if defined(ENABLE_AARCH64_HIP09)
+ case AArch64::ADCSWr:
+ case AArch64::ADCSXr:
+ case AArch64::SBCSWr:
+ case AArch64::SBCSXr:
+#endif
return true;
case AArch64::ADDSWrs:
case AArch64::ADDSXrs:
@@ -183,6 +189,20 @@ static bool isLiteralsPair(const MachineInstr *FirstMI,
SecondMI.getOperand(3).getImm() == 16))
return true;
+#if defined(ENABLE_AARCH64_HIP09)
+ // 32 bit immediate.
+ if ((FirstMI == nullptr || FirstMI->getOpcode() == AArch64::MOVNWi) &&
+ (SecondMI.getOpcode() == AArch64::MOVKWi &&
+ SecondMI.getOperand(3).getImm() == 16))
+ return true;
+
+ // Lower half of 64 bit immediate.
+ if ((FirstMI == nullptr || FirstMI->getOpcode() == AArch64::MOVNXi) &&
+ (SecondMI.getOpcode() == AArch64::MOVKWi &&
+ SecondMI.getOperand(3).getImm() == 16))
+ return true;
+#endif
+
// Upper half of 64 bit immediate.
if ((FirstMI == nullptr ||
(FirstMI->getOpcode() == AArch64::MOVKXi &&
@@ -437,6 +457,37 @@ static bool isAddSub2RegAndConstOnePair(const MachineInstr *FirstMI,
return false;
}
+#if defined(ENABLE_AARCH64_HIP09)
+static bool isMvnClzPair(const MachineInstr *FirstMI,
+ const MachineInstr &SecondMI) {
+ // HIP09 supports fusion of MVN + CLZ.
+ // Fusing CLZ with the preceding MVN makes execution faster.
+ // Fusion is not allowed for the shifted forms of MVN.
+ //
+ // Instruction alias info:
+ // 1. MVN <Wd>, <Wm>{, <shift> #<amount>} is equivalent to
+ // ORN <Wd>, WZR, <Wm>{, <shift> #<amount>}
+ // 2. MVN <Xd>, <Xm>{, <shift> #<amount>} is equivalent to
+ // ORN <Xd>, XZR, <Xm>{, <shift> #<amount>}
+ // Assume the 1st instr to be a wildcard if it is unspecified.
+ if ((FirstMI == nullptr ||
+ ((FirstMI->getOpcode() == AArch64::ORNWrs) &&
+ (FirstMI->getOperand(1).getReg() == AArch64::WZR) &&
+ (!AArch64InstrInfo::hasShiftedReg(*FirstMI)))) &&
+ (SecondMI.getOpcode() == AArch64::CLZWr))
+ return true;
+
+ if ((FirstMI == nullptr ||
+ ((FirstMI->getOpcode() == AArch64::ORNXrs) &&
+ (FirstMI->getOperand(1).getReg() == AArch64::XZR) &&
+ (!AArch64InstrInfo::hasShiftedReg(*FirstMI)))) &&
+ (SecondMI.getOpcode() == AArch64::CLZXr))
+ return true;
+
+ return false;
+}
+#endif
+
/// \brief Check if the instr pair, FirstMI and SecondMI, should be fused
/// together. Given SecondMI, when FirstMI is unspecified, then check if
/// SecondMI may be part of a fused pair at all.
@@ -472,6 +523,10 @@ static bool shouldScheduleAdjacent(const TargetInstrInfo &TII,
if (ST.hasFuseAddSub2RegAndConstOne() &&
isAddSub2RegAndConstOnePair(FirstMI, SecondMI))
return true;
+#if defined(ENABLE_AARCH64_HIP09)
+ if (ST.hasFuseMvnClz() && isMvnClzPair(FirstMI, SecondMI))
+ return true;
+#endif
return false;
}
diff --git a/llvm/lib/Target/AArch64/AArch64Subtarget.cpp b/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
index 450e27b8a..ddf22364c 100644
--- a/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
+++ b/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
@@ -266,6 +266,15 @@ void AArch64Subtarget::initializeProperties() {
PrefFunctionAlignment = Align(16);
PrefLoopAlignment = Align(4);
break;
+#if defined(ENABLE_AARCH64_HIP09)
+ case HIP09:
+ CacheLineSize = 64;
+ PrefFunctionAlignment = Align(16);
+ PrefLoopAlignment = Align(4);
+ VScaleForTuning = 2;
+ DefaultSVETFOpts = TailFoldingOpts::Simple;
+ break;
+#endif
case ThunderX3T110:
CacheLineSize = 64;
PrefFunctionAlignment = Align(16);
diff --git a/llvm/lib/Target/AArch64/AArch64Subtarget.h b/llvm/lib/Target/AArch64/AArch64Subtarget.h
index 5e20d1646..5f481f4f9 100644
--- a/llvm/lib/Target/AArch64/AArch64Subtarget.h
+++ b/llvm/lib/Target/AArch64/AArch64Subtarget.h
@@ -87,7 +87,10 @@ public:
ThunderXT83,
ThunderXT88,
ThunderX3T110,
- TSV110
+ TSV110,
+#if defined(ENABLE_AARCH64_HIP09)
+ HIP09
+#endif
};
protected:
@@ -239,7 +242,11 @@ public:
bool hasFusion() const {
return hasArithmeticBccFusion() || hasArithmeticCbzFusion() ||
hasFuseAES() || hasFuseArithmeticLogic() || hasFuseCCSelect() ||
+#if defined(ENABLE_AARCH64_HIP09)
+ hasFuseAdrpAdd() || hasFuseLiterals() || hasFuseMvnClz();
+#else
hasFuseAdrpAdd() || hasFuseLiterals();
+#endif
}
unsigned getMaxInterleaveFactor() const { return MaxInterleaveFactor; }
diff --git a/llvm/lib/Target/CMakeLists.txt b/llvm/lib/Target/CMakeLists.txt
index 2739233f9..501ce1f2f 100644
--- a/llvm/lib/Target/CMakeLists.txt
+++ b/llvm/lib/Target/CMakeLists.txt
@@ -2,6 +2,10 @@ list(APPEND LLVM_COMMON_DEPENDS intrinsics_gen)
list(APPEND LLVM_TABLEGEN_FLAGS -I ${LLVM_MAIN_SRC_DIR}/lib/Target)
+if(LLVM_ENABLE_AARCH64_HIP09)
+ list(APPEND LLVM_TABLEGEN_FLAGS "-DENABLE_AARCH64_HIP09")
+endif()
+
add_llvm_component_library(LLVMTarget
Target.cpp
TargetIntrinsicInfo.cpp
diff --git a/llvm/lib/TargetParser/Host.cpp b/llvm/lib/TargetParser/Host.cpp
index d11dc605e..8b23be02e 100644
--- a/llvm/lib/TargetParser/Host.cpp
+++ b/llvm/lib/TargetParser/Host.cpp
@@ -257,6 +257,9 @@ StringRef sys::detail::getHostCPUNameForARM(StringRef ProcCpuinfoContent) {
// contents are specified in the various processor manuals.
return StringSwitch<const char *>(Part)
.Case("0xd01", "tsv110")
+#if defined(ENABLE_AARCH64_HIP09)
+ .Case("0xd02", "hip09")
+#endif
.Default("generic");
if (Implementer == "0x51") // Qualcomm Technologies, Inc.
diff --git a/llvm/test/CodeGen/AArch64/cpus-hip09.ll b/llvm/test/CodeGen/AArch64/cpus-hip09.ll
new file mode 100644
index 000000000..dcf32e4dc
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/cpus-hip09.ll
@@ -0,0 +1,11 @@
+; REQUIRES: enable_enable_aarch64_hip09
+; This tests that llc accepts all valid AArch64 CPUs
+
+; RUN: llc < %s -mtriple=arm64-unknown-unknown -mcpu=hip09 2>&1 | FileCheck %s
+
+; CHECK-NOT: {{.*}} is not a recognized processor for this target
+; INVALID: {{.*}} is not a recognized processor for this target
+
+define i32 @f(i64 %z) {
+ ret i32 0
+}
diff --git a/llvm/test/CodeGen/AArch64/macro-fusion-mvnclz.mir b/llvm/test/CodeGen/AArch64/macro-fusion-mvnclz.mir
new file mode 100644
index 000000000..64bf15937
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/macro-fusion-mvnclz.mir
@@ -0,0 +1,20 @@
+# REQUIRES: enable_enable_aarch64_hip09
+# RUN: llc -o - %s -mtriple=aarch64-- -mattr=+fuse-mvn-clz -run-pass postmisched | FileCheck %s --check-prefixes=CHECK,FUSION
+# RUN: llc -o - %s -mtriple=aarch64-- -mattr=-fuse-mvn-clz -run-pass postmisched | FileCheck %s --check-prefixes=CHECK,NOFUSION
+---
+# CHECK-LABEL: name: fuse-mvn-clz
+# CHECK: $w2 = ORNWrs $wzr, $w1, 0
+# FUSION: $w0 = CLZWr killed renamable $w2
+# CHECK: $w3 = ADDWri killed renamable $w1, 1, 0
+# NOFUSION: $w0 = CLZWr killed renamable $w2
+name: fuse-mvn-clz
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $w0, $w1, $w2, $w3
+
+ $w2 = ORNWrs $wzr, $w1, 0
+ $w3 = ADDWri killed renamable $w1, 1, 0
+ $w0 = CLZWr killed renamable $w2
+ RET undef $lr, implicit $w0
+...
diff --git a/llvm/test/CodeGen/AArch64/misched-fusion-lit-hip09.ll b/llvm/test/CodeGen/AArch64/misched-fusion-lit-hip09.ll
new file mode 100644
index 000000000..d67fa5b43
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/misched-fusion-lit-hip09.ll
@@ -0,0 +1,73 @@
+; REQUIRES: enable_enable_aarch64_hip09
+; RUN: llc %s -o - -mtriple=aarch64-unknown -mcpu=hip09 | FileCheck %s --check-prefix=CHECK --check-prefix=CHECKFUSE-HIP09
+
+@g = common local_unnamed_addr global ptr null, align 8
+
+define dso_local ptr @litp(i32 %a, i32 %b) {
+entry:
+ %add = add nsw i32 %b, %a
+ %idx.ext = sext i32 %add to i64
+ %add.ptr = getelementptr i8, ptr @litp, i64 %idx.ext
+ store ptr %add.ptr, ptr @g, align 8
+ ret ptr %add.ptr
+
+; CHECK-LABEL: litp:
+; CHECK: adrp [[R:x[0-9]+]], litp
+; CHECKFUSE-NEXT: add {{x[0-9]+}}, [[R]], :lo12:litp
+}
+
+define dso_local ptr @litp_tune_generic(i32 %a, i32 %b) "tune-cpu"="generic" {
+entry:
+ %add = add nsw i32 %b, %a
+ %idx.ext = sext i32 %add to i64
+ %add.ptr = getelementptr i8, ptr @litp_tune_generic, i64 %idx.ext
+ store ptr %add.ptr, ptr @g, align 8
+ ret ptr %add.ptr
+
+; CHECK-LABEL: litp_tune_generic:
+; CHECK: adrp [[R:x[0-9]+]], litp_tune_generic
+; CHECK-NEXT: add {{x[0-9]+}}, [[R]], :lo12:litp_tune_generic
+}
+
+define dso_local i32 @liti(i32 %a, i32 %b) {
+entry:
+ %add = add i32 %a, -262095121
+ %add1 = add i32 %add, %b
+ ret i32 %add1
+
+; CHECK-LABEL: liti:
+; CHECK: mov [[R:w[0-9]+]], {{#[0-9]+}}
+; CHECKDONT-NEXT: add {{w[0-9]+}}, {{w[0-9]+}}, {{w[0-9]+}}
+; CHECKFUSE-NEXT: movk [[R]], {{#[0-9]+}}, lsl #16
+; CHECKFUSE-HIP09: movk [[R]], {{#[0-9]+}}, lsl #16
+}
+
+; Function Attrs: norecurse nounwind readnone
+define dso_local i64 @litl(i64 %a, i64 %b) {
+entry:
+ %add = add i64 %a, 2208998440489107183
+ %add1 = add i64 %add, %b
+ ret i64 %add1
+
+; CHECK-LABEL: litl:
+; CHECK: mov [[R:x[0-9]+]], {{#[0-9]+}}
+; CHECKDONT-NEXT: add {{x[0-9]+}}, {{x[0-9]+}}, {{x[0-9]+}}
+; CHECK-NEXT: movk [[R]], {{#[0-9]+}}, lsl #16
+; CHECK: movk [[R]], {{#[0-9]+}}, lsl #32
+; CHECK-NEXT: movk [[R]], {{#[0-9]+}}, lsl #48
+}
+
+; Function Attrs: norecurse nounwind readnone
+define dso_local double @litf() {
+entry:
+ ret double 0x400921FB54442D18
+
+; CHECK-LABEL: litf:
+; CHECK-DONT: adrp [[ADDR:x[0-9]+]], [[CSTLABEL:.LCP.*]]
+; CHECK-DONT-NEXT: ldr {{d[0-9]+}}, {{[[]}}[[ADDR]], :lo12:[[CSTLABEL]]{{[]]}}
+; CHECKFUSE-HIP09: mov [[R:x[0-9]+]], #11544
+; CHECKFUSE-HIP09: movk [[R]], #21572, lsl #16
+; CHECKFUSE-HIP09: movk [[R]], #8699, lsl #32
+; CHECKFUSE-HIP09: movk [[R]], #16393, lsl #48
+; CHECKFUSE-HIP09: fmov {{d[0-9]+}}, [[R]]
+}
diff --git a/llvm/test/CodeGen/AArch64/remat-hip09.ll b/llvm/test/CodeGen/AArch64/remat-hip09.ll
new file mode 100644
index 000000000..aec0d18ae
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/remat-hip09.ll
@@ -0,0 +1,18 @@
+; REQUIRES: enable_enable_aarch64_hip09
+; RUN: llc -mtriple=aarch64-linux-gnuabi -mcpu=hip09 -o - %s | FileCheck %s
+
+%X = type { i64, i64, i64 }
+declare void @f(ptr)
+define void @t() {
+entry:
+ %tmp = alloca %X
+ call void @f(ptr %tmp)
+; CHECK: add x0, sp, #8
+; CHECK-NOT: mov
+; CHECK-NEXT: bl f
+ call void @f(ptr %tmp)
+; CHECK: add x0, sp, #8
+; CHECK-NOT: mov
+; CHECK-NEXT: bl f
+ ret void
+}
diff --git a/llvm/test/lit.site.cfg.py.in b/llvm/test/lit.site.cfg.py.in
index 20c1ecca1..6145a514f 100644
--- a/llvm/test/lit.site.cfg.py.in
+++ b/llvm/test/lit.site.cfg.py.in
@@ -64,9 +64,13 @@ config.have_llvm_driver = @LLVM_TOOL_LLVM_DRIVER_BUILD@
config.use_classic_flang = @LLVM_ENABLE_CLASSIC_FLANG@
config.enable_enable_autotuner = @LLVM_ENABLE_AUTOTUNER@
+config.enable_enable_aarch64_hip09 = @LLVM_ENABLE_AARCH64_HIP09@
import lit.llvm
lit.llvm.initialize(lit_config, config)
+if config.enable_enable_aarch64_hip09:
+ config.available_features.add("enable_enable_aarch64_hip09")
+
# Let the main config do the real work.
lit_config.load_config(
config, os.path.join(config.llvm_src_root, "test/lit.cfg.py"))
diff --git a/llvm/unittests/TargetParser/Host.cpp b/llvm/unittests/TargetParser/Host.cpp
index 452d0326c..4b4c81514 100644
--- a/llvm/unittests/TargetParser/Host.cpp
+++ b/llvm/unittests/TargetParser/Host.cpp
@@ -250,6 +250,11 @@ CPU part : 0x0a1
EXPECT_EQ(sys::detail::getHostCPUNameForARM("CPU implementer : 0x48\n"
"CPU part : 0xd01"),
"tsv110");
+#if defined(ENABLE_AARCH64_HIP09)
+ EXPECT_EQ(sys::detail::getHostCPUNameForARM("CPU implementer : 0x48\n"
+ "CPU part : 0xd02"),
+ "hip09");
+#endif
// Verify A64FX.
const std::string A64FXProcCpuInfo = R"(
diff --git a/llvm/unittests/TargetParser/TargetParserTest.cpp b/llvm/unittests/TargetParser/TargetParserTest.cpp
index 741d5a2d4..94e0047e5 100644
--- a/llvm/unittests/TargetParser/TargetParserTest.cpp
+++ b/llvm/unittests/TargetParser/TargetParserTest.cpp
@@ -1421,6 +1421,18 @@ INSTANTIATE_TEST_SUITE_P(
AArch64::AEK_PROFILE | AArch64::AEK_FP16 |
AArch64::AEK_FP16FML | AArch64::AEK_DOTPROD,
"8.2-A"),
+#if defined(ENABLE_AARCH64_HIP09)
+ ARMCPUTestParams(
+ "hip09", "armv8.5-a", "crypto-neon-fp-armv8",
+ AArch64::AEK_CRC | AArch64::AEK_FP | AArch64::AEK_SIMD |
+ AArch64::AEK_RAS | AArch64::AEK_LSE | AArch64::AEK_RDM |
+ AArch64::AEK_RCPC | AArch64::AEK_DOTPROD | AArch64::AEK_AES |
+ AArch64::AEK_SM4 | AArch64::AEK_SHA2 | AArch64::AEK_SHA3 |
+ AArch64::AEK_FP16 | AArch64::AEK_PROFILE |
+ AArch64::AEK_FP16FML | AArch64::AEK_SVE | AArch64::AEK_I8MM |
+ AArch64::AEK_F32MM | AArch64::AEK_F64MM | AArch64::AEK_BF16,
+ "8.5-A"),
+#endif
ARMCPUTestParams("a64fx", "armv8.2-a", "crypto-neon-fp-armv8",
AArch64::AEK_CRC | AArch64::AEK_AES |
AArch64::AEK_SHA2 | AArch64::AEK_FP |
@@ -1437,7 +1449,11 @@ INSTANTIATE_TEST_SUITE_P(
"8.2-A")));
// Note: number of CPUs includes aliases.
+#if defined(ENABLE_AARCH64_HIP09)
+static constexpr unsigned NumAArch64CPUArchs = 63;
+#else
static constexpr unsigned NumAArch64CPUArchs = 62;
+#endif
TEST(TargetParserTest, testAArch64CPUArchList) {
SmallVector<StringRef, NumAArch64CPUArchs> List;
--
2.19.1
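The `isMvnClzPair` predicate in the HIP09 patch above can be modelled outside LLVM to make its cases explicit. This is a sketch under stated assumptions: `Instr` is a hypothetical record, `operand1` stands for the register checked against WZR/XZR, and `shift` stands in for whatever `AArch64InstrInfo::hasShiftedReg` inspects (assumed here to be a nonzero shift amount).

```python
from collections import namedtuple

# Hypothetical stand-in for a MachineInstr; not an LLVM type.
Instr = namedtuple("Instr", ["opcode", "operand1", "shift"])

# (MVN alias opcode, zero register, matching CLZ opcode) per register width.
_PAIRS = (
    ("ORNWrs", "WZR", "CLZWr"),
    ("ORNXrs", "XZR", "CLZXr"),
)


def is_mvn_clz_pair(first, second):
    """True if `first` (or a wildcard None) is an unshifted MVN feeding a CLZ."""
    for orn, zero_reg, clz in _PAIRS:
        first_ok = first is None or (
            first.opcode == orn
            and first.operand1 == zero_reg
            and first.shift == 0  # shifted forms are not fused
        )
        if first_ok and second.opcode == clz:
            return True
    return False


mvn_w = Instr("ORNWrs", "WZR", 0)
clz_w = Instr("CLZWr", None, 0)
print(is_mvn_clz_pair(mvn_w, clz_w))
```

Passing `None` as the first instruction mirrors the wildcard case in the patch, where only the second instruction is known.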

File diff suppressed because it is too large Load Diff

@ -0,0 +1,30 @@
From cf9d549f2c40d548587f8d2d3cda0d32f13c9256 Mon Sep 17 00:00:00 2001
From: Temperatureblock <102174059+Temperature-block@users.noreply.github.com>
Date: Mon, 12 Aug 2024 20:06:58 +0530
Subject: [PATCH] Simple check to ignore Inline asm fwait insertion (#101686)
Just a simple check to ignore Inline asm fwait insertion
Fixes #101613
---
llvm/lib/Target/X86/X86InstrInfo.cpp | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/llvm/lib/Target/X86/X86InstrInfo.cpp b/llvm/lib/Target/X86/X86InstrInfo.cpp
index 10a0ccdcb023..e615fa09608c 100644
--- a/llvm/lib/Target/X86/X86InstrInfo.cpp
+++ b/llvm/lib/Target/X86/X86InstrInfo.cpp
@@ -2947,6 +2947,11 @@ static bool isX87Reg(unsigned Reg) {
/// check if the instruction is X87 instruction
bool X86::isX87Instruction(MachineInstr &MI) {
+ // Calls and inline asm define X87 registers, so we special-case inline asm
+ // here; otherwise it would be incorrectly flagged as an x87 instruction.
+ if (MI.isInlineAsm())
+ return false;
for (const MachineOperand &MO : MI.operands()) {
if (!MO.isReg())
continue;
--
Gitee
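The early return added to `X86::isX87Instruction` above can be illustrated with a small model. The rest of the operand loop is cut off in the hunk, so this sketch assumes it reports an x87 instruction when any register operand is an x87 register; `X87_REGS`, `Instr`, and the register names are hypothetical placeholders, not the LLVM definitions.

```python
from collections import namedtuple

# Hypothetical instruction record: a flag plus the names of register operands.
Instr = namedtuple("Instr", ["is_inline_asm", "reg_operands"])

# Assumed x87 register set, for illustration only.
X87_REGS = {"FPCW", "FPSW"} | {f"ST{i}" for i in range(8)}


def is_x87_instruction(instr):
    """Model of the patched check: inline asm is never treated as x87."""
    if instr.is_inline_asm:
        return False
    return any(reg in X87_REGS for reg in instr.reg_operands)


inline_asm = Instr(is_inline_asm=True, reg_operands=["ST0", "FPSW"])
fadd = Instr(is_inline_asm=False, reg_operands=["ST0", "ST1"])
print(is_x87_instruction(inline_asm), is_x87_instruction(fadd))
```

The inline asm record touches x87 registers but is still rejected, which is what keeps the wait insertion from firing on it.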

@ -0,0 +1,24 @@
From 2513e90fd317bbe5854a06213e43cdf7029c3ee2 Mon Sep 17 00:00:00 2001
From: liyunfei <liyunfei33@huawei.com>
Date: Tue, 5 Nov 2024 18:18:19 +0800
Subject: [PATCH] Add arch restriction for BiSheng Autotuner
BiSheng Autotuner currently supports only x86_64 and aarch64.
Signed-off-by: liyunfei <liyunfei33@huawei.com>
---
llvm/test/AutoTuning/lit.local.cfg | 2 ++
1 file changed, 2 insertions(+)
diff --git a/llvm/test/AutoTuning/lit.local.cfg b/llvm/test/AutoTuning/lit.local.cfg
index 13b4927257ab..c48c2c9eab6f 100644
--- a/llvm/test/AutoTuning/lit.local.cfg
+++ b/llvm/test/AutoTuning/lit.local.cfg
@@ -1,2 +1,4 @@
if not config.enable_enable_autotuner:
config.unsupported = True
+if config.host_arch not in ["x86", "X86", 'x86_64', 'aarch64']:
+ config.unsupported = True
\ No newline at end of file
--
Gitee
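The two conditions in the patched `lit.local.cfg` above combine into a single "unsupported" decision. A minimal sketch, assuming a hypothetical helper `autotuner_tests_unsupported` that takes the two config values as plain arguments instead of reading lit's `config` object:

```python
# Host architectures accepted by the patched lit.local.cfg.
SUPPORTED_HOST_ARCHES = {"x86", "X86", "x86_64", "aarch64"}


def autotuner_tests_unsupported(enable_autotuner, host_arch):
    """Mirror the two `config.unsupported = True` branches as one predicate."""
    if not enable_autotuner:
        return True
    return host_arch not in SUPPORTED_HOST_ARCHES


for arch in ("x86_64", "aarch64", "riscv64"):
    print(arch, autotuner_tests_unsupported(True, arch))
```

Either condition alone is enough to skip the directory, so a supported architecture does not re-enable the tests when the autotuner is disabled.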

@ -0,0 +1,514 @@
From 42b0d16ab1ced5720e017fa9f6059c32489ab1bd Mon Sep 17 00:00:00 2001
From: xiajingze <xiajingze1@huawei.com>
Date: Wed, 9 Oct 2024 17:13:49 +0800
Subject: [PATCH] [AArch64] Delete hip09 macro
Signed-off-by: xiajingze <xiajingze1@huawei.com>
---
llvm/cmake/modules/HandleLLVMOptions.cmake | 8 --
.../llvm/TargetParser/AArch64TargetParser.h | 2 -
llvm/lib/Target/AArch64/AArch64.td | 8 --
.../lib/Target/AArch64/AArch64MacroFusion.cpp | 8 --
llvm/lib/Target/AArch64/AArch64Subtarget.cpp | 2 -
llvm/lib/Target/AArch64/AArch64Subtarget.h | 6 --
llvm/lib/Target/CMakeLists.txt | 4 -
llvm/lib/TargetParser/Host.cpp | 2 -
llvm/test/CodeGen/AArch64/cpus-hip09.ll | 11 ---
llvm/test/CodeGen/AArch64/cpus.ll | 1 +
.../CodeGen/AArch64/macro-fusion-mvnclz.mir | 1 -
.../AArch64/misched-fusion-lit-hip09.ll | 73 --------------
.../CodeGen/AArch64/misched-fusion-lit.ll | 7 ++
llvm/test/CodeGen/AArch64/remat-hip09.ll | 18 ----
llvm/test/CodeGen/AArch64/remat.ll | 1 +
llvm/test/lit.site.cfg.py.in | 4 -
llvm/unittests/TargetParser/Host.cpp | 2 -
.../TargetParser/TargetParserTest.cpp | 6 --
18 files changed, 9 insertions(+), 155 deletions(-)
delete mode 100644 llvm/test/CodeGen/AArch64/cpus-hip09.ll
delete mode 100644 llvm/test/CodeGen/AArch64/misched-fusion-lit-hip09.ll
delete mode 100644 llvm/test/CodeGen/AArch64/remat-hip09.ll
diff --git a/llvm/cmake/modules/HandleLLVMOptions.cmake b/llvm/cmake/modules/HandleLLVMOptions.cmake
index 74e68e25d85c..8be5d4ba52c2 100644
--- a/llvm/cmake/modules/HandleLLVMOptions.cmake
+++ b/llvm/cmake/modules/HandleLLVMOptions.cmake
@@ -112,14 +112,6 @@ else()
set(LLVM_ENABLE_AUTOTUNER 0)
endif()
-option(LLVM_ENABLE_AARCH64_HIP09 "Enable HIP09 Processor" ON)
-if(LLVM_ENABLE_AARCH64_HIP09)
- set(LLVM_ENABLE_AARCH64_HIP09 1)
- add_definitions( -DENABLE_AARCH64_HIP09 )
-else()
- set(LLVM_ENABLE_AARCH64_HIP09 0)
-endif()
-
if(LLVM_ENABLE_EXPENSIVE_CHECKS)
add_compile_definitions(EXPENSIVE_CHECKS)
diff --git a/llvm/include/llvm/TargetParser/AArch64TargetParser.h b/llvm/include/llvm/TargetParser/AArch64TargetParser.h
index 07cd2fcbb68d..8b25cce0abdc 100644
--- a/llvm/include/llvm/TargetParser/AArch64TargetParser.h
+++ b/llvm/include/llvm/TargetParser/AArch64TargetParser.h
@@ -542,13 +542,11 @@ inline constexpr CpuInfo CpuInfos[] = {
(AArch64::AEK_FP16 | AArch64::AEK_RAND | AArch64::AEK_SM4 |
AArch64::AEK_SHA3 | AArch64::AEK_SHA2 | AArch64::AEK_AES |
AArch64::AEK_MTE | AArch64::AEK_SB | AArch64::AEK_SSBS)},
-#if defined(ENABLE_AARCH64_HIP09)
{"hip09", ARMV8_5A,
(AArch64::AEK_AES | AArch64::AEK_SM4 | AArch64::AEK_SHA2 |
AArch64::AEK_SHA3 | AArch64::AEK_FP16 | AArch64::AEK_PROFILE |
AArch64::AEK_FP16FML | AArch64::AEK_SVE | AArch64::AEK_I8MM |
AArch64::AEK_F32MM | AArch64::AEK_F64MM | AArch64::AEK_BF16)},
-#endif
};
// An alias for a CPU.
diff --git a/llvm/lib/Target/AArch64/AArch64.td b/llvm/lib/Target/AArch64/AArch64.td
index c8bfd770f55f..fdb931a0fe6c 100644
--- a/llvm/lib/Target/AArch64/AArch64.td
+++ b/llvm/lib/Target/AArch64/AArch64.td
@@ -296,11 +296,9 @@ def FeatureFuseAddSub2RegAndConstOne : SubtargetFeature<
"fuse-addsub-2reg-const1", "HasFuseAddSub2RegAndConstOne", "true",
"CPU fuses (a + b + 1) and (a - b - 1)">;
-#ifdef ENABLE_AARCH64_HIP09
def FeatureFuseMvnClz : SubtargetFeature<
"fuse-mvn-clz", "HasFuseMvnClz", "true",
"CPU fuses mvn+clz operations">;
-#endif
def FeatureDisableLatencySchedHeuristic : SubtargetFeature<
"disable-latency-sched-heuristic", "DisableLatencySchedHeuristic", "true",
@@ -1211,7 +1209,6 @@ def TuneTSV110 : SubtargetFeature<"tsv110", "ARMProcFamily", "TSV110",
FeatureFuseAES,
FeaturePostRAScheduler]>;
-#ifdef ENABLE_AARCH64_HIP09
def TuneHIP09 : SubtargetFeature<"hip09", "ARMProcFamily", "HIP09",
"HiSilicon HIP-09 processors", [
FeatureCustomCheapAsMoveHandling,
@@ -1224,7 +1221,6 @@ def TuneHIP09 : SubtargetFeature<"hip09", "ARMProcFamily", "HIP09",
FeatureFuseLiterals,
FeatureFuseMvnClz,
FeaturePostRAScheduler]>;
-#endif
def TuneAmpere1 : SubtargetFeature<"ampere1", "ARMProcFamily", "Ampere1",
"Ampere Computing Ampere-1 processors", [
@@ -1380,14 +1376,12 @@ def ProcessorFeatures {
list<SubtargetFeature> TSV110 = [HasV8_2aOps, FeatureCrypto, FeatureFPARMv8,
FeatureNEON, FeaturePerfMon, FeatureSPE,
FeatureFullFP16, FeatureFP16FML, FeatureDotProd];
-#ifdef ENABLE_AARCH64_HIP09
list<SubtargetFeature> HIP09 = [HasV8_5aOps, FeatureBF16, FeatureCrypto, FeatureFPARMv8,
FeatureMatMulInt8, FeatureMatMulFP32, FeatureMatMulFP64,
FeatureNEON, FeaturePerfMon, FeatureRandGen, FeatureSPE,
FeatureFullFP16, FeatureFP16FML, FeatureDotProd,
FeatureJS, FeatureComplxNum, FeatureSHA3, FeatureSM4,
FeatureSVE];
-#endif
list<SubtargetFeature> Ampere1 = [HasV8_6aOps, FeatureNEON, FeaturePerfMon,
FeatureSSBS, FeatureRandGen, FeatureSB,
FeatureSHA2, FeatureSHA3, FeatureAES];
@@ -1497,11 +1491,9 @@ def : ProcessorModel<"thunderx3t110", ThunderX3T110Model,
// HiSilicon Processors.
def : ProcessorModel<"tsv110", TSV110Model, ProcessorFeatures.TSV110,
[TuneTSV110]>;
-#ifdef ENABLE_AARCH64_HIP09
// FIXME: HiSilicon HIP09 is currently modeled as a Cortex-A57.
def : ProcessorModel<"hip09", CortexA57Model, ProcessorFeatures.HIP09,
[TuneHIP09]>;
-#endif
// Support cyclone as an alias for apple-a7 so we can still LTO old bitcode.
def : ProcessorModel<"cyclone", CycloneModel, ProcessorFeatures.AppleA7,
diff --git a/llvm/lib/Target/AArch64/AArch64MacroFusion.cpp b/llvm/lib/Target/AArch64/AArch64MacroFusion.cpp
index 4963ec350db2..44daa06468c5 100644
--- a/llvm/lib/Target/AArch64/AArch64MacroFusion.cpp
+++ b/llvm/lib/Target/AArch64/AArch64MacroFusion.cpp
@@ -51,12 +51,10 @@ static bool isArithmeticBccPair(const MachineInstr *FirstMI,
case AArch64::SUBSXrr:
case AArch64::BICSWrr:
case AArch64::BICSXrr:
-#if defined(ENABLE_AARCH64_HIP09)
case AArch64::ADCSWr:
case AArch64::ADCSXr:
case AArch64::SBCSWr:
case AArch64::SBCSXr:
-#endif
return true;
case AArch64::ADDSWrs:
case AArch64::ADDSXrs:
@@ -189,7 +187,6 @@ static bool isLiteralsPair(const MachineInstr *FirstMI,
SecondMI.getOperand(3).getImm() == 16))
return true;
-#if defined(ENABLE_AARCH64_HIP09)
// 32 bit immediate.
if ((FirstMI == nullptr || FirstMI->getOpcode() == AArch64::MOVNWi) &&
(SecondMI.getOpcode() == AArch64::MOVKWi &&
@@ -201,7 +198,6 @@ static bool isLiteralsPair(const MachineInstr *FirstMI,
(SecondMI.getOpcode() == AArch64::MOVKWi &&
SecondMI.getOperand(3).getImm() == 16))
return true;
-#endif
// Upper half of 64 bit immediate.
if ((FirstMI == nullptr ||
@@ -457,7 +453,6 @@ static bool isAddSub2RegAndConstOnePair(const MachineInstr *FirstMI,
return false;
}
-#if defined(ENABLE_AARCH64_HIP09)
static bool isMvnClzPair(const MachineInstr *FirstMI,
const MachineInstr &SecondMI) {
// HIP09 supports fusion of MVN + CLZ.
@@ -486,7 +481,6 @@ static bool isMvnClzPair(const MachineInstr *FirstMI,
return false;
}
-#endif
/// \brief Check if the instr pair, FirstMI and SecondMI, should be fused
/// together. Given SecondMI, when FirstMI is unspecified, then check if
@@ -523,10 +517,8 @@ static bool shouldScheduleAdjacent(const TargetInstrInfo &TII,
if (ST.hasFuseAddSub2RegAndConstOne() &&
isAddSub2RegAndConstOnePair(FirstMI, SecondMI))
return true;
-#if defined(ENABLE_AARCH64_HIP09)
if (ST.hasFuseMvnClz() && isMvnClzPair(FirstMI, SecondMI))
return true;
-#endif
return false;
}
diff --git a/llvm/lib/Target/AArch64/AArch64Subtarget.cpp b/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
index ddf22364c78e..1aff7e30a0cf 100644
--- a/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
+++ b/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
@@ -266,7 +266,6 @@ void AArch64Subtarget::initializeProperties() {
PrefFunctionAlignment = Align(16);
PrefLoopAlignment = Align(4);
break;
-#if defined(ENABLE_AARCH64_HIP09)
case HIP09:
CacheLineSize = 64;
PrefFunctionAlignment = Align(16);
@@ -274,7 +273,6 @@ void AArch64Subtarget::initializeProperties() {
VScaleForTuning = 2;
DefaultSVETFOpts = TailFoldingOpts::Simple;
break;
-#endif
case ThunderX3T110:
CacheLineSize = 64;
PrefFunctionAlignment = Align(16);
diff --git a/llvm/lib/Target/AArch64/AArch64Subtarget.h b/llvm/lib/Target/AArch64/AArch64Subtarget.h
index 5f481f4f976a..8a1cebe96894 100644
--- a/llvm/lib/Target/AArch64/AArch64Subtarget.h
+++ b/llvm/lib/Target/AArch64/AArch64Subtarget.h
@@ -88,9 +88,7 @@ public:
ThunderXT88,
ThunderX3T110,
TSV110,
-#if defined(ENABLE_AARCH64_HIP09)
HIP09
-#endif
};
protected:
@@ -242,11 +240,7 @@ public:
bool hasFusion() const {
return hasArithmeticBccFusion() || hasArithmeticCbzFusion() ||
hasFuseAES() || hasFuseArithmeticLogic() || hasFuseCCSelect() ||
-#if defined(ENABLE_AARCH64_HIP09)
hasFuseAdrpAdd() || hasFuseLiterals() || hasFuseMvnClz();
-#else
- hasFuseAdrpAdd() || hasFuseLiterals();
-#endif
}
unsigned getMaxInterleaveFactor() const { return MaxInterleaveFactor; }
diff --git a/llvm/lib/Target/CMakeLists.txt b/llvm/lib/Target/CMakeLists.txt
index 501ce1f2fe53..2739233f9ccb 100644
--- a/llvm/lib/Target/CMakeLists.txt
+++ b/llvm/lib/Target/CMakeLists.txt
@@ -2,10 +2,6 @@ list(APPEND LLVM_COMMON_DEPENDS intrinsics_gen)
list(APPEND LLVM_TABLEGEN_FLAGS -I ${LLVM_MAIN_SRC_DIR}/lib/Target)
-if(LLVM_ENABLE_AARCH64_HIP09)
- list(APPEND LLVM_TABLEGEN_FLAGS "-DENABLE_AARCH64_HIP09")
-endif()
-
add_llvm_component_library(LLVMTarget
Target.cpp
TargetIntrinsicInfo.cpp
diff --git a/llvm/lib/TargetParser/Host.cpp b/llvm/lib/TargetParser/Host.cpp
index 8b23be02edc0..8b1191a5b442 100644
--- a/llvm/lib/TargetParser/Host.cpp
+++ b/llvm/lib/TargetParser/Host.cpp
@@ -257,9 +257,7 @@ StringRef sys::detail::getHostCPUNameForARM(StringRef ProcCpuinfoContent) {
// contents are specified in the various processor manuals.
return StringSwitch<const char *>(Part)
.Case("0xd01", "tsv110")
-#if defined(ENABLE_AARCH64_HIP09)
.Case("0xd02", "hip09")
-#endif
.Default("generic");
if (Implementer == "0x51") // Qualcomm Technologies, Inc.
diff --git a/llvm/test/CodeGen/AArch64/cpus-hip09.ll b/llvm/test/CodeGen/AArch64/cpus-hip09.ll
deleted file mode 100644
index dcf32e4dca89..000000000000
--- a/llvm/test/CodeGen/AArch64/cpus-hip09.ll
+++ /dev/null
@@ -1,11 +0,0 @@
-; REQUIRES: enable_enable_aarch64_hip09
-; This tests that llc accepts all valid AArch64 CPUs
-
-; RUN: llc < %s -mtriple=arm64-unknown-unknown -mcpu=hip09 2>&1 | FileCheck %s
-
-; CHECK-NOT: {{.*}} is not a recognized processor for this target
-; INVALID: {{.*}} is not a recognized processor for this target
-
-define i32 @f(i64 %z) {
- ret i32 0
-}
diff --git a/llvm/test/CodeGen/AArch64/cpus.ll b/llvm/test/CodeGen/AArch64/cpus.ll
index b24866064efa..56772f6c6049 100644
--- a/llvm/test/CodeGen/AArch64/cpus.ll
+++ b/llvm/test/CodeGen/AArch64/cpus.ll
@@ -33,6 +33,7 @@
; RUN: llc < %s -mtriple=arm64-unknown-unknown -mcpu=thunderx2t99 2>&1 | FileCheck %s
; RUN: llc < %s -mtriple=arm64-unknown-unknown -mcpu=thunderx3t110 2>&1 | FileCheck %s
; RUN: llc < %s -mtriple=arm64-unknown-unknown -mcpu=tsv110 2>&1 | FileCheck %s
+; RUN: llc < %s -mtriple=arm64-unknown-unknown -mcpu=hip09 2>&1 | FileCheck %s
; RUN: llc < %s -mtriple=arm64-unknown-unknown -mcpu=apple-latest 2>&1 | FileCheck %s
; RUN: llc < %s -mtriple=arm64-unknown-unknown -mcpu=a64fx 2>&1 | FileCheck %s
; RUN: llc < %s -mtriple=arm64-unknown-unknown -mcpu=ampere1 2>&1 | FileCheck %s
diff --git a/llvm/test/CodeGen/AArch64/macro-fusion-mvnclz.mir b/llvm/test/CodeGen/AArch64/macro-fusion-mvnclz.mir
index 64bf159370f9..26ba76ef0af5 100644
--- a/llvm/test/CodeGen/AArch64/macro-fusion-mvnclz.mir
+++ b/llvm/test/CodeGen/AArch64/macro-fusion-mvnclz.mir
@@ -1,4 +1,3 @@
-# REQUIRES: enable_enable_aarch64_hip09
# RUN: llc -o - %s -mtriple=aarch64-- -mattr=+fuse-mvn-clz -run-pass postmisched | FileCheck %s --check-prefixes=CHECK,FUSION
# RUN: llc -o - %s -mtriple=aarch64-- -mattr=-fuse-mvn-clz -run-pass postmisched | FileCheck %s --check-prefixes=CHECK,NOFUSION
---
diff --git a/llvm/test/CodeGen/AArch64/misched-fusion-lit-hip09.ll b/llvm/test/CodeGen/AArch64/misched-fusion-lit-hip09.ll
deleted file mode 100644
index d67fa5b4374c..000000000000
--- a/llvm/test/CodeGen/AArch64/misched-fusion-lit-hip09.ll
+++ /dev/null
@@ -1,73 +0,0 @@
-; REQUIRES: enable_enable_aarch64_hip09
-; RUN: llc %s -o - -mtriple=aarch64-unknown -mcpu=hip09 | FileCheck %s --check-prefix=CHECK --check-prefix=CHECKFUSE-HIP09
-
-@g = common local_unnamed_addr global ptr null, align 8
-
-define dso_local ptr @litp(i32 %a, i32 %b) {
-entry:
- %add = add nsw i32 %b, %a
- %idx.ext = sext i32 %add to i64
- %add.ptr = getelementptr i8, ptr @litp, i64 %idx.ext
- store ptr %add.ptr, ptr @g, align 8
- ret ptr %add.ptr
-
-; CHECK-LABEL: litp:
-; CHECK: adrp [[R:x[0-9]+]], litp
-; CHECKFUSE-NEXT: add {{x[0-9]+}}, [[R]], :lo12:litp
-}
-
-define dso_local ptr @litp_tune_generic(i32 %a, i32 %b) "tune-cpu"="generic" {
-entry:
- %add = add nsw i32 %b, %a
- %idx.ext = sext i32 %add to i64
- %add.ptr = getelementptr i8, ptr @litp_tune_generic, i64 %idx.ext
- store ptr %add.ptr, ptr @g, align 8
- ret ptr %add.ptr
-
-; CHECK-LABEL: litp_tune_generic:
-; CHECK: adrp [[R:x[0-9]+]], litp_tune_generic
-; CHECK-NEXT: add {{x[0-9]+}}, [[R]], :lo12:litp_tune_generic
-}
-
-define dso_local i32 @liti(i32 %a, i32 %b) {
-entry:
- %add = add i32 %a, -262095121
- %add1 = add i32 %add, %b
- ret i32 %add1
-
-; CHECK-LABEL: liti:
-; CHECK: mov [[R:w[0-9]+]], {{#[0-9]+}}
-; CHECKDONT-NEXT: add {{w[0-9]+}}, {{w[0-9]+}}, {{w[0-9]+}}
-; CHECKFUSE-NEXT: movk [[R]], {{#[0-9]+}}, lsl #16
-; CHECKFUSE-HIP09: movk [[R]], {{#[0-9]+}}, lsl #16
-}
-
-; Function Attrs: norecurse nounwind readnone
-define dso_local i64 @litl(i64 %a, i64 %b) {
-entry:
- %add = add i64 %a, 2208998440489107183
- %add1 = add i64 %add, %b
- ret i64 %add1
-
-; CHECK-LABEL: litl:
-; CHECK: mov [[R:x[0-9]+]], {{#[0-9]+}}
-; CHECKDONT-NEXT: add {{x[0-9]+}}, {{x[0-9]+}}, {{x[0-9]+}}
-; CHECK-NEXT: movk [[R]], {{#[0-9]+}}, lsl #16
-; CHECK: movk [[R]], {{#[0-9]+}}, lsl #32
-; CHECK-NEXT: movk [[R]], {{#[0-9]+}}, lsl #48
-}
-
-; Function Attrs: norecurse nounwind readnone
-define dso_local double @litf() {
-entry:
- ret double 0x400921FB54442D18
-
-; CHECK-LABEL: litf:
-; CHECK-DONT: adrp [[ADDR:x[0-9]+]], [[CSTLABEL:.LCP.*]]
-; CHECK-DONT-NEXT: ldr {{d[0-9]+}}, {{[[]}}[[ADDR]], :lo12:[[CSTLABEL]]{{[]]}}
-; CHECKFUSE-HIP09: mov [[R:x[0-9]+]], #11544
-; CHECKFUSE-HIP09: movk [[R]], #21572, lsl #16
-; CHECKFUSE-HIP09: movk [[R]], #8699, lsl #32
-; CHECKFUSE-HIP09: movk [[R]], #16393, lsl #48
-; CHECKFUSE-HIP09: fmov {{d[0-9]+}}, [[R]]
-}
diff --git a/llvm/test/CodeGen/AArch64/misched-fusion-lit.ll b/llvm/test/CodeGen/AArch64/misched-fusion-lit.ll
index ad244d30df11..67cc7aa503b6 100644
--- a/llvm/test/CodeGen/AArch64/misched-fusion-lit.ll
+++ b/llvm/test/CodeGen/AArch64/misched-fusion-lit.ll
@@ -7,6 +7,7 @@
; RUN: llc %s -o - -mtriple=aarch64-unknown -mcpu=exynos-m4 | FileCheck %s --check-prefix=CHECK --check-prefix=CHECKFUSE
; RUN: llc %s -o - -mtriple=aarch64-unknown -mcpu=exynos-m5 | FileCheck %s --check-prefix=CHECK --check-prefix=CHECKFUSE
; RUN: llc %s -o - -mtriple=aarch64-unknown -mcpu=neoverse-n1 | FileCheck %s --check-prefix=CHECKFUSE-NEOVERSE
+; RUN: llc %s -o - -mtriple=aarch64-unknown -mcpu=hip09 | FileCheck %s --check-prefix=CHECK --check-prefix=CHECKFUSE-HIP09
@g = common local_unnamed_addr global ptr null, align 8
@@ -59,6 +60,7 @@ entry:
; CHECK: mov [[R:w[0-9]+]], {{#[0-9]+}}
; CHECKDONT-NEXT: add {{w[0-9]+}}, {{w[0-9]+}}, {{w[0-9]+}}
; CHECKFUSE-NEXT: movk [[R]], {{#[0-9]+}}, lsl #16
+; CHECKFUSE-HIP09: movk [[R]], {{#[0-9]+}}, lsl #16
}
; Function Attrs: norecurse nounwind readnone
@@ -89,4 +91,9 @@ entry:
; CHECK-FUSE: movk [[R]], #8699, lsl #32
; CHECK-FUSE: movk [[R]], #16393, lsl #48
; CHECK-FUSE: fmov {{d[0-9]+}}, [[R]]
+; CHECKFUSE-HIP09: mov [[R:x[0-9]+]], #11544
+; CHECKFUSE-HIP09: movk [[R]], #21572, lsl #16
+; CHECKFUSE-HIP09: movk [[R]], #8699, lsl #32
+; CHECKFUSE-HIP09: movk [[R]], #16393, lsl #48
+; CHECKFUSE-HIP09: fmov {{d[0-9]+}}, [[R]]
}
diff --git a/llvm/test/CodeGen/AArch64/remat-hip09.ll b/llvm/test/CodeGen/AArch64/remat-hip09.ll
deleted file mode 100644
index aec0d18ae73f..000000000000
--- a/llvm/test/CodeGen/AArch64/remat-hip09.ll
+++ /dev/null
@@ -1,18 +0,0 @@
-; REQUIRES: enable_enable_aarch64_hip09
-; RUN: llc -mtriple=aarch64-linux-gnuabi -mcpu=hip09 -o - %s | FileCheck %s
-
-%X = type { i64, i64, i64 }
-declare void @f(ptr)
-define void @t() {
-entry:
- %tmp = alloca %X
- call void @f(ptr %tmp)
-; CHECK: add x0, sp, #8
-; CHECK-NOT: mov
-; CHECK-NEXT: bl f
- call void @f(ptr %tmp)
-; CHECK: add x0, sp, #8
-; CHECK-NOT: mov
-; CHECK-NEXT: bl f
- ret void
-}
diff --git a/llvm/test/CodeGen/AArch64/remat.ll b/llvm/test/CodeGen/AArch64/remat.ll
index 483c4d71ee21..fa039246c7f5 100644
--- a/llvm/test/CodeGen/AArch64/remat.ll
+++ b/llvm/test/CodeGen/AArch64/remat.ll
@@ -22,6 +22,7 @@
; RUN: llc -mtriple=aarch64-linux-gnuabi -mcpu=kryo -o - %s | FileCheck %s
; RUN: llc -mtriple=aarch64-linux-gnuabi -mcpu=thunderx2t99 -o - %s | FileCheck %s
; RUN: llc -mtriple=aarch64-linux-gnuabi -mcpu=tsv110 -o - %s | FileCheck %s
+; RUN: llc -mtriple=aarch64-linux-gnuabi -mcpu=hip09 -o - %s | FileCheck %s
; RUN: llc -mtriple=aarch64-linux-gnuabi -mattr=+custom-cheap-as-move -o - %s | FileCheck %s
; RUN: llc -mtriple=aarch64-linux-gnuabi -mcpu=thunderx3t110 -o - %s | FileCheck %s
; RUN: llc -mtriple=aarch64-linux-gnuabi -mcpu=ampere1 -o - %s | FileCheck %s
diff --git a/llvm/test/lit.site.cfg.py.in b/llvm/test/lit.site.cfg.py.in
index 6145a514f008..20c1ecca1d43 100644
--- a/llvm/test/lit.site.cfg.py.in
+++ b/llvm/test/lit.site.cfg.py.in
@@ -63,14 +63,10 @@ config.dxil_tests = @LLVM_INCLUDE_DXIL_TESTS@
config.have_llvm_driver = @LLVM_TOOL_LLVM_DRIVER_BUILD@
config.use_classic_flang = @LLVM_ENABLE_CLASSIC_FLANG@
config.enable_enable_autotuner = @LLVM_ENABLE_AUTOTUNER@
-config.enable_enable_aarch64_hip09 = @LLVM_ENABLE_AARCH64_HIP09@
import lit.llvm
lit.llvm.initialize(lit_config, config)
-if config.enable_enable_aarch64_hip09:
- config.available_features.add("enable_enable_aarch64_hip09")
-
# Let the main config do the real work.
lit_config.load_config(
config, os.path.join(config.llvm_src_root, "test/lit.cfg.py"))
diff --git a/llvm/unittests/TargetParser/Host.cpp b/llvm/unittests/TargetParser/Host.cpp
index 4b4c81514896..cfc41486b173 100644
--- a/llvm/unittests/TargetParser/Host.cpp
+++ b/llvm/unittests/TargetParser/Host.cpp
@@ -250,11 +250,9 @@ CPU part : 0x0a1
EXPECT_EQ(sys::detail::getHostCPUNameForARM("CPU implementer : 0x48\n"
"CPU part : 0xd01"),
"tsv110");
-#if defined(ENABLE_AARCH64_HIP09)
EXPECT_EQ(sys::detail::getHostCPUNameForARM("CPU implementer : 0x48\n"
"CPU part : 0xd02"),
"hip09");
-#endif
// Verify A64FX.
const std::string A64FXProcCpuInfo = R"(
diff --git a/llvm/unittests/TargetParser/TargetParserTest.cpp b/llvm/unittests/TargetParser/TargetParserTest.cpp
index 94e0047e567b..daa38474004e 100644
--- a/llvm/unittests/TargetParser/TargetParserTest.cpp
+++ b/llvm/unittests/TargetParser/TargetParserTest.cpp
@@ -1421,7 +1421,6 @@ INSTANTIATE_TEST_SUITE_P(
AArch64::AEK_PROFILE | AArch64::AEK_FP16 |
AArch64::AEK_FP16FML | AArch64::AEK_DOTPROD,
"8.2-A"),
-#if defined(ENABLE_AARCH64_HIP09)
ARMCPUTestParams(
"hip09", "armv8.5-a", "crypto-neon-fp-armv8",
AArch64::AEK_CRC | AArch64::AEK_FP | AArch64::AEK_SIMD |
@@ -1432,7 +1431,6 @@ INSTANTIATE_TEST_SUITE_P(
AArch64::AEK_FP16FML | AArch64::AEK_SVE | AArch64::AEK_I8MM |
AArch64::AEK_F32MM | AArch64::AEK_F64MM | AArch64::AEK_BF16,
"8.5-A"),
-#endif
ARMCPUTestParams("a64fx", "armv8.2-a", "crypto-neon-fp-armv8",
AArch64::AEK_CRC | AArch64::AEK_AES |
AArch64::AEK_SHA2 | AArch64::AEK_FP |
@@ -1449,11 +1447,7 @@ INSTANTIATE_TEST_SUITE_P(
"8.2-A")));
// Note: number of CPUs includes aliases.
-#if defined(ENABLE_AARCH64_HIP09)
static constexpr unsigned NumAArch64CPUArchs = 63;
-#else
-static constexpr unsigned NumAArch64CPUArchs = 62;
-#endif
TEST(TargetParserTest, testAArch64CPUArchList) {
SmallVector<StringRef, NumAArch64CPUArchs> List;
--
2.43.0
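
Reviewer note: the `Host.cpp` hunk in the patch above makes the `0xd02` to `hip09` mapping unconditional, and the unit test expects implementer `0x48` with part `0xd01` or `0xd02` to yield `tsv110` or `hip09`. The lookup can be sketched standalone as below. This is a simplified illustration, not LLVM's `getHostCPUNameForARM`: the helpers `fieldValue` and `hostCpuNameForHiSilicon` are hypothetical names, and the parsing is deliberately minimal.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: return the text after ':' on the first line of
// Content that starts with Key, with surrounding whitespace trimmed.
static std::string fieldValue(const std::string &Content,
                              const std::string &Key) {
  assert(!Key.empty() && "key must be non-empty");
  std::istringstream In(Content);
  std::string Line;
  while (std::getline(In, Line)) {
    if (Line.compare(0, Key.size(), Key) != 0)
      continue;
    std::size_t Colon = Line.find(':');
    if (Colon == std::string::npos)
      continue;
    std::size_t Start = Line.find_first_not_of(" \t", Colon + 1);
    if (Start == std::string::npos)
      return "";
    std::size_t End = Line.find_last_not_of(" \t\r");
    return Line.substr(Start, End - Start + 1);
  }
  return "";
}

// Hypothetical helper: map HiSilicon (implementer 0x48) part IDs to the CPU
// names used in the diff; anything else falls back to "generic".
static std::string hostCpuNameForHiSilicon(const std::string &ProcCpuinfo) {
  if (fieldValue(ProcCpuinfo, "CPU implementer") != "0x48")
    return "generic";
  const std::string Part = fieldValue(ProcCpuinfo, "CPU part");
  if (Part == "0xd01")
    return "tsv110";
  if (Part == "0xd02")
    return "hip09";
  return "generic";
}
```

Calling `hostCpuNameForHiSilicon("CPU implementer : 0x48\nCPU part : 0xd02")` exercises the same input as the unit test in the patch.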

@ -0,0 +1,72 @@
From 28e3fc80336935bc8bed372e78616ef5be9f4908 Mon Sep 17 00:00:00 2001
From: Arthur Eubanks <aeubanks@google.com>
Date: Thu, 27 Jul 2023 13:27:58 -0700
Subject: [PATCH] Don't zero out noreg operands
A tail call may have $noreg operands.
Fixes a crash.
Reviewed By: xgupta
Differential Revision: https://reviews.llvm.org/D156485
---
llvm/lib/CodeGen/PrologEpilogInserter.cpp | 9 +++++++--
llvm/test/CodeGen/X86/zero-call-used-regs.ll | 14 ++++++++++++++
2 files changed, 21 insertions(+), 2 deletions(-)
diff --git a/llvm/lib/CodeGen/PrologEpilogInserter.cpp b/llvm/lib/CodeGen/PrologEpilogInserter.cpp
index e323aaaeefaf..49047719fdaa 100644
--- a/llvm/lib/CodeGen/PrologEpilogInserter.cpp
+++ b/llvm/lib/CodeGen/PrologEpilogInserter.cpp
@@ -1285,6 +1285,8 @@ void PEI::insertZeroCallUsedRegs(MachineFunction &MF) {
continue;
MCRegister Reg = MO.getReg();
+ if (!Reg)
+ continue;
// This picks up sibling registers (e.q. %al -> %ah).
for (MCRegUnit Unit : TRI.regunits(Reg))
@@ -1308,8 +1310,11 @@ void PEI::insertZeroCallUsedRegs(MachineFunction &MF) {
if (!MO.isReg())
continue;
- for (const MCPhysReg &Reg :
- TRI.sub_and_superregs_inclusive(MO.getReg()))
+ MCRegister Reg = MO.getReg();
+ if (!Reg)
+ continue;
+
+ for (const MCPhysReg Reg : TRI.sub_and_superregs_inclusive(Reg))
RegsToZero.reset(Reg);
}
}
diff --git a/llvm/test/CodeGen/X86/zero-call-used-regs.ll b/llvm/test/CodeGen/X86/zero-call-used-regs.ll
index 63d51c916bb9..97ad5ce9c8cb 100644
--- a/llvm/test/CodeGen/X86/zero-call-used-regs.ll
+++ b/llvm/test/CodeGen/X86/zero-call-used-regs.ll
@@ -241,6 +241,20 @@ entry:
ret i32 %x
}
+define dso_local void @tailcall(ptr %p) local_unnamed_addr #0 "zero-call-used-regs"="used-gpr" {
+; I386-LABEL: tailcall:
+; I386: # %bb.0:
+; I386-NEXT: movl {{[0-9]+}}(%esp), %eax
+; I386-NEXT: jmpl *(%eax) # TAILCALL
+;
+; X86-64-LABEL: tailcall:
+; X86-64: # %bb.0:
+; X86-64-NEXT: jmpq *(%rdi) # TAILCALL
+ %c = load ptr, ptr %p
+ tail call void %c()
+ ret void
+}
+
; Don't emit zeroing registers in "main" function.
define dso_local i32 @main() local_unnamed_addr #1 {
; I386-LABEL: main:
--
2.43.0

@ -0,0 +1,246 @@
From 60ff801d1ea96ab964039cc1ed42e1dca0a63d54 Mon Sep 17 00:00:00 2001
From: Anton Sidorenko <anton.sidorenko@syntacore.com>
Date: Tue, 6 Feb 2024 12:02:06 +0300
Subject: [PATCH] [SimplifyLibCalls] Merge sqrt into the power of exp (#79146)
Under fast-math flags it's possible to convert `sqrt(exp(X))` into
`exp(X * 0.5)`. I suppose that this transformation is always profitable.
This is similar to the optimization existing in GCC.
---
.../llvm/Transforms/Utils/SimplifyLibCalls.h | 1 +
.../lib/Transforms/Utils/SimplifyLibCalls.cpp | 67 ++++++++++
llvm/test/Transforms/InstCombine/sqrt.ll | 120 ++++++++++++++++++
3 files changed, 188 insertions(+)
diff --git a/llvm/include/llvm/Transforms/Utils/SimplifyLibCalls.h b/llvm/include/llvm/Transforms/Utils/SimplifyLibCalls.h
index eb10545ee149..1aad0b298845 100644
--- a/llvm/include/llvm/Transforms/Utils/SimplifyLibCalls.h
+++ b/llvm/include/llvm/Transforms/Utils/SimplifyLibCalls.h
@@ -201,6 +201,7 @@ private:
Value *optimizeFMinFMax(CallInst *CI, IRBuilderBase &B);
Value *optimizeLog(CallInst *CI, IRBuilderBase &B);
Value *optimizeSqrt(CallInst *CI, IRBuilderBase &B);
+ Value *mergeSqrtToExp(CallInst *CI, IRBuilderBase &B);
Value *optimizeSinCosPi(CallInst *CI, bool IsSin, IRBuilderBase &B);
Value *optimizeTan(CallInst *CI, IRBuilderBase &B);
// Wrapper for all floating point library call optimizations
diff --git a/llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp b/llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
index 3ad97613fe7a..dd5bbdaaf6d3 100644
--- a/llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
@@ -2539,6 +2539,70 @@ Value *LibCallSimplifier::optimizeLog(CallInst *Log, IRBuilderBase &B) {
return Ret;
}
+// sqrt(exp(X)) -> exp(X * 0.5)
+Value *LibCallSimplifier::mergeSqrtToExp(CallInst *CI, IRBuilderBase &B) {
+ if (!CI->hasAllowReassoc())
+ return nullptr;
+
+ Function *SqrtFn = CI->getCalledFunction();
+ CallInst *Arg = dyn_cast<CallInst>(CI->getArgOperand(0));
+ if (!Arg || !Arg->hasAllowReassoc() || !Arg->hasOneUse())
+ return nullptr;
+ Intrinsic::ID ArgID = Arg->getIntrinsicID();
+ LibFunc ArgLb = NotLibFunc;
+ TLI->getLibFunc(*Arg, ArgLb);
+
+ LibFunc SqrtLb, ExpLb, Exp2Lb, Exp10Lb;
+
+ if (TLI->getLibFunc(SqrtFn->getName(), SqrtLb))
+ switch (SqrtLb) {
+ case LibFunc_sqrtf:
+ ExpLb = LibFunc_expf;
+ Exp2Lb = LibFunc_exp2f;
+ Exp10Lb = LibFunc_exp10f;
+ break;
+ case LibFunc_sqrt:
+ ExpLb = LibFunc_exp;
+ Exp2Lb = LibFunc_exp2;
+ Exp10Lb = LibFunc_exp10;
+ break;
+ case LibFunc_sqrtl:
+ ExpLb = LibFunc_expl;
+ Exp2Lb = LibFunc_exp2l;
+ Exp10Lb = LibFunc_exp10l;
+ break;
+ default:
+ return nullptr;
+ }
+ else if (SqrtFn->getIntrinsicID() == Intrinsic::sqrt) {
+ if (CI->getType()->getScalarType()->isFloatTy()) {
+ ExpLb = LibFunc_expf;
+ Exp2Lb = LibFunc_exp2f;
+ Exp10Lb = LibFunc_exp10f;
+ } else if (CI->getType()->getScalarType()->isDoubleTy()) {
+ ExpLb = LibFunc_exp;
+ Exp2Lb = LibFunc_exp2;
+ Exp10Lb = LibFunc_exp10;
+ } else
+ return nullptr;
+ } else
+ return nullptr;
+
+ if (ArgLb != ExpLb && ArgLb != Exp2Lb && ArgLb != Exp10Lb &&
+ ArgID != Intrinsic::exp && ArgID != Intrinsic::exp2)
+ return nullptr;
+
+ IRBuilderBase::InsertPointGuard Guard(B);
+ B.SetInsertPoint(Arg);
+ auto *ExpOperand = Arg->getOperand(0);
+ auto *FMul =
+ B.CreateFMulFMF(ExpOperand, ConstantFP::get(ExpOperand->getType(), 0.5),
+ CI, "merged.sqrt");
+
+ Arg->setOperand(0, FMul);
+ return Arg;
+}
+
Value *LibCallSimplifier::optimizeSqrt(CallInst *CI, IRBuilderBase &B) {
Module *M = CI->getModule();
Function *Callee = CI->getCalledFunction();
@@ -2551,6 +2615,9 @@ Value *LibCallSimplifier::optimizeSqrt(CallInst *CI, IRBuilderBase &B) {
Callee->getIntrinsicID() == Intrinsic::sqrt))
Ret = optimizeUnaryDoubleFP(CI, B, TLI, true);
+ if (Value *Opt = mergeSqrtToExp(CI, B))
+ return Opt;
+
if (!CI->isFast())
return Ret;
diff --git a/llvm/test/Transforms/InstCombine/sqrt.ll b/llvm/test/Transforms/InstCombine/sqrt.ll
index 004df3e30c72..f72fe5a6a581 100644
--- a/llvm/test/Transforms/InstCombine/sqrt.ll
+++ b/llvm/test/Transforms/InstCombine/sqrt.ll
@@ -88,7 +88,127 @@ define float @sqrt_call_fabs_f32(float %x) {
ret float %sqrt
}
+define double @sqrt_exp(double %x) {
+; CHECK-LABEL: @sqrt_exp(
+; CHECK-NEXT: [[MERGED_SQRT:%.*]] = fmul reassoc double [[X:%.*]], 5.000000e-01
+; CHECK-NEXT: [[E:%.*]] = call reassoc double @llvm.exp.f64(double [[MERGED_SQRT]])
+; CHECK-NEXT: ret double [[E]]
+;
+ %e = call reassoc double @llvm.exp.f64(double %x)
+ %res = call reassoc double @llvm.sqrt.f64(double %e)
+ ret double %res
+}
+
+define double @sqrt_exp_2(double %x) {
+; CHECK-LABEL: @sqrt_exp_2(
+; CHECK-NEXT: [[MERGED_SQRT:%.*]] = fmul reassoc double [[X:%.*]], 5.000000e-01
+; CHECK-NEXT: [[E:%.*]] = call reassoc double @exp(double [[MERGED_SQRT]])
+; CHECK-NEXT: ret double [[E]]
+;
+ %e = call reassoc double @exp(double %x)
+ %res = call reassoc double @sqrt(double %e)
+ ret double %res
+}
+
+define double @sqrt_exp2(double %x) {
+; CHECK-LABEL: @sqrt_exp2(
+; CHECK-NEXT: [[MERGED_SQRT:%.*]] = fmul reassoc double [[X:%.*]], 5.000000e-01
+; CHECK-NEXT: [[E:%.*]] = call reassoc double @exp2(double [[MERGED_SQRT]])
+; CHECK-NEXT: ret double [[E]]
+;
+ %e = call reassoc double @exp2(double %x)
+ %res = call reassoc double @sqrt(double %e)
+ ret double %res
+}
+
+define double @sqrt_exp10(double %x) {
+; CHECK-LABEL: @sqrt_exp10(
+; CHECK-NEXT: [[MERGED_SQRT:%.*]] = fmul reassoc double [[X:%.*]], 5.000000e-01
+; CHECK-NEXT: [[E:%.*]] = call reassoc double @exp10(double [[MERGED_SQRT]])
+; CHECK-NEXT: ret double [[E]]
+;
+ %e = call reassoc double @exp10(double %x)
+ %res = call reassoc double @sqrt(double %e)
+ ret double %res
+}
+
+; Negative test
+define double @sqrt_exp_nofast_1(double %x) {
+; CHECK-LABEL: @sqrt_exp_nofast_1(
+; CHECK-NEXT: [[E:%.*]] = call double @llvm.exp.f64(double [[X:%.*]])
+; CHECK-NEXT: [[RES:%.*]] = call reassoc double @llvm.sqrt.f64(double [[E]])
+; CHECK-NEXT: ret double [[RES]]
+;
+ %e = call double @llvm.exp.f64(double %x)
+ %res = call reassoc double @llvm.sqrt.f64(double %e)
+ ret double %res
+}
+
+; Negative test
+define double @sqrt_exp_nofast_2(double %x) {
+; CHECK-LABEL: @sqrt_exp_nofast_2(
+; CHECK-NEXT: [[E:%.*]] = call reassoc double @llvm.exp.f64(double [[X:%.*]])
+; CHECK-NEXT: [[RES:%.*]] = call double @llvm.sqrt.f64(double [[E]])
+; CHECK-NEXT: ret double [[RES]]
+;
+ %e = call reassoc double @llvm.exp.f64(double %x)
+ %res = call double @llvm.sqrt.f64(double %e)
+ ret double %res
+}
+
+define double @sqrt_exp_merge_constant(double %x, double %y) {
+; CHECK-LABEL: @sqrt_exp_merge_constant(
+; CHECK-NEXT: [[MERGED_SQRT:%.*]] = fmul reassoc nsz double [[X:%.*]], 5.000000e+00
+; CHECK-NEXT: [[E:%.*]] = call reassoc double @llvm.exp.f64(double [[MERGED_SQRT]])
+; CHECK-NEXT: ret double [[E]]
+;
+ %mul = fmul reassoc nsz double %x, 10.0
+ %e = call reassoc double @llvm.exp.f64(double %mul)
+ %res = call reassoc nsz double @llvm.sqrt.f64(double %e)
+ ret double %res
+}
+
+define double @sqrt_exp_intr_and_libcall(double %x) {
+; CHECK-LABEL: @sqrt_exp_intr_and_libcall(
+; CHECK-NEXT: [[MERGED_SQRT:%.*]] = fmul reassoc double [[X:%.*]], 5.000000e-01
+; CHECK-NEXT: [[E:%.*]] = call reassoc double @exp(double [[MERGED_SQRT]])
+; CHECK-NEXT: ret double [[E]]
+;
+ %e = call reassoc double @exp(double %x)
+ %res = call reassoc double @llvm.sqrt.f64(double %e)
+ ret double %res
+}
+
+define double @sqrt_exp_intr_and_libcall_2(double %x) {
+; CHECK-LABEL: @sqrt_exp_intr_and_libcall_2(
+; CHECK-NEXT: [[MERGED_SQRT:%.*]] = fmul reassoc double [[X:%.*]], 5.000000e-01
+; CHECK-NEXT: [[E:%.*]] = call reassoc double @llvm.exp.f64(double [[MERGED_SQRT]])
+; CHECK-NEXT: ret double [[E]]
+;
+ %e = call reassoc double @llvm.exp.f64(double %x)
+ %res = call reassoc double @sqrt(double %e)
+ ret double %res
+}
+
+define <2 x float> @sqrt_exp_vec(<2 x float> %x) {
+; CHECK-LABEL: @sqrt_exp_vec(
+; CHECK-NEXT: [[MERGED_SQRT:%.*]] = fmul reassoc <2 x float> [[X:%.*]], <float 5.000000e-01, float 5.000000e-01>
+; CHECK-NEXT: [[E:%.*]] = call reassoc <2 x float> @llvm.exp.v2f32(<2 x float> [[MERGED_SQRT]])
+; CHECK-NEXT: ret <2 x float> [[E]]
+;
+ %e = call reassoc <2 x float> @llvm.exp.v2f32(<2 x float> %x)
+ %res = call reassoc <2 x float> @llvm.sqrt.v2f32(<2 x float> %e)
+ ret <2 x float> %res
+}
+
declare i32 @foo(double)
declare double @sqrt(double) readnone
declare float @sqrtf(float)
declare float @llvm.fabs.f32(float)
+declare double @llvm.exp.f64(double)
+declare double @llvm.sqrt.f64(double)
+declare double @exp(double)
+declare double @exp2(double)
+declare double @exp10(double)
+declare <2 x float> @llvm.exp.v2f32(<2 x float>)
+declare <2 x float> @llvm.sqrt.v2f32(<2 x float>)
--
2.38.1.windows.1
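
Reviewer note: the rewrite in the patch above relies on the identity sqrt(exp(X)) = exp(X * 0.5). The sketch below is a standalone numeric illustration only and is not part of the patch; the function names are hypothetical. It also shows one reason the transform is gated on `reassoc`: the two forms can differ when `exp(X)` overflows.

```cpp
#include <cassert>
#include <cmath>

// Relative difference between the unmerged form sqrt(exp(X)) and the
// merged form exp(X * 0.5) for a finite, non-overflowing X.
static double mergedSqrtRelError(double X) {
  const double Unmerged = std::sqrt(std::exp(X));
  const double Merged = std::exp(X * 0.5);
  return std::fabs(Unmerged - Merged) / Merged;
}

// True when the unmerged form overflows to infinity while the merged form
// stays finite, i.e. the rewrite changes the observable result.
static bool mergeChangesOverflow(double X) {
  return std::isinf(std::sqrt(std::exp(X))) && std::isfinite(std::exp(X * 0.5));
}

// Small usage example: the forms agree closely for ordinary inputs and
// diverge for an input large enough to overflow exp in double precision.
static void checkMergedSqrt() {
  assert(mergedSqrtRelError(3.0) < 1e-12);
  assert(mergeChangesOverflow(1000.0));
}
```

The same reasoning applies to `exp2` and `exp10`, which the patch also handles, since sqrt(b^X) = b^(X * 0.5) for any positive base b.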

@ -0,0 +1,187 @@
From fdbf1bd9f1bdec32384eda47f419d895d11a1c50 Mon Sep 17 00:00:00 2001
From: XingYuShuai <1150775134@qq.com>
Date: Wed, 15 May 2024 14:42:27 +0800
Subject: [PATCH] [LICM] Solve runtime error caused by the signal function.
Use the option enable-signal to control whether to work around the
runtime error caused by the signal function when LTO is turned on.
---
llvm/cmake/modules/HandleLLVMOptions.cmake | 8 ++++
llvm/lib/Transforms/Scalar/LICM.cpp | 47 +++++++++++++++++++
.../Transforms/LICM/signal-before-loop-2.ll | 25 ++++++++++
.../Transforms/LICM/signal-before-loop.ll | 25 ++++++++++
llvm/test/lit.site.cfg.py.in | 1 +
5 files changed, 106 insertions(+)
create mode 100644 llvm/test/Transforms/LICM/signal-before-loop-2.ll
create mode 100644 llvm/test/Transforms/LICM/signal-before-loop.ll
diff --git a/llvm/cmake/modules/HandleLLVMOptions.cmake b/llvm/cmake/modules/HandleLLVMOptions.cmake
index b8e9dbe29d88..8be5d4ba52c2 100644
--- a/llvm/cmake/modules/HandleLLVMOptions.cmake
+++ b/llvm/cmake/modules/HandleLLVMOptions.cmake
@@ -120,6 +120,14 @@ else()
set(LLVM_ENABLE_AUTOTUNER 0)
endif()
+option(LLVM_BUILD_FOR_COMMON "" ON)
+if(LLVM_BUILD_FOR_COMMON)
+ set(LLVM_BUILD_FOR_COMMON 1)
+ add_definitions( -DBUILD_FOR_COMMON )
+else()
+ set(LLVM_BUILD_FOR_COMMON 0)
+endif()
+
if(LLVM_ENABLE_EXPENSIVE_CHECKS)
add_compile_definitions(EXPENSIVE_CHECKS)
diff --git a/llvm/lib/Transforms/Scalar/LICM.cpp b/llvm/lib/Transforms/Scalar/LICM.cpp
index f8fab03f151d..2feec759f240 100644
--- a/llvm/lib/Transforms/Scalar/LICM.cpp
+++ b/llvm/lib/Transforms/Scalar/LICM.cpp
@@ -44,6 +44,9 @@
#include "llvm/Analysis/AliasSetTracker.h"
#include "llvm/Analysis/AssumptionCache.h"
#include "llvm/Analysis/CaptureTracking.h"
+#ifdef BUILD_FOR_COMMON
+#include "llvm/Analysis/CFG.h"
+#endif // BUILD_FOR_COMMON
#include "llvm/Analysis/GuardUtils.h"
#include "llvm/Analysis/LazyBlockFrequencyInfo.h"
#include "llvm/Analysis/Loads.h"
@@ -122,6 +125,13 @@ static cl::opt<bool>
SingleThread("licm-force-thread-model-single", cl::Hidden, cl::init(false),
cl::desc("Force thread model single in LICM pass"));
+#ifdef BUILD_FOR_COMMON
+static cl::opt<bool> DisableMovStoreInsOutsideOfLoopInSigFun(
+ "disable-move-store-ins-outside-of-loop",
+ cl::Hidden, cl::init(true), cl::desc("Disable move store instruction "
+ "outside of loop in signal function."));
+#endif // BUILD_FOR_COMMON
+
static cl::opt<uint32_t> MaxNumUsesTraversed(
"licm-max-num-uses-traversed", cl::Hidden, cl::init(8),
cl::desc("Max num uses visited for identifying load "
@@ -2075,8 +2085,45 @@ bool llvm::promoteLoopAccessesToScalars(
for (Use &U : ASIV->uses()) {
// Ignore instructions that are outside the loop.
Instruction *UI = dyn_cast<Instruction>(U.getUser());
+ #if defined(BUILD_FOR_COMMON)
+ if (DisableMovStoreInsOutsideOfLoopInSigFun) {
+ if (!UI)
+ continue;
+
+ // In the following scenario, a loop index store instruction is moved
+ // outside the loop, and when the loop is terminated by a signal, the
+ // store instruction is not executed. However, the handler registered via
+ // signal reads the data written by that store, so the data it reads is
+ // incorrect.
+ // Solution: Prevent the store instruction from moving outside the loop.
+ // NOTE: The __sysv_signal function takes the same arguments and performs
+ // the same task as signal. Both belong to glibc.
+ if(StoreSafety == StoreSafe && !CurLoop->contains(UI)) {
+ if(LoadInst *NotCurLoopLoad = dyn_cast<LoadInst>(UI)) {
+ Function *NotCurLoopFun = UI->getParent()->getParent();
+ for (Use &UseFun : NotCurLoopFun->uses()) {
+ CallInst *Call = dyn_cast<CallInst>(UseFun.getUser());
+ if (Call && Call->getCalledFunction() &&
+ (Call->getCalledFunction()->getName() == "__sysv_signal" ||
+ Call->getCalledFunction()->getName() == "signal") &&
+ isPotentiallyReachable(Call->getParent(),
+ CurLoop->getLoopPreheader(),NULL,DT,
+ LI))
+ return false;
+ }
+ }
+ }
+
+ if (!CurLoop->contains(UI))
+ continue;
+ } else {
+ if (!UI || !CurLoop->contains(UI))
+ continue;
+ }
+#else
if (!UI || !CurLoop->contains(UI))
continue;
+#endif // BUILD_FOR_COMMON
// If there is an non-load/store instruction in the loop, we can't promote
// it.
diff --git a/llvm/test/Transforms/LICM/signal-before-loop-2.ll b/llvm/test/Transforms/LICM/signal-before-loop-2.ll
new file mode 100644
index 000000000000..da878c6c691b
--- /dev/null
+++ b/llvm/test/Transforms/LICM/signal-before-loop-2.ll
@@ -0,0 +1,25 @@
+; REQUIRES: enable_build_for_common
+; RUN:opt -disable-move-store-ins-outside-of-loop=true -S < %s | FileCheck %s
+
+@Run_Index = external global i64
+
+declare ptr @signal(ptr)
+
+define void @report() {
+entry:
+ %0 = load i64, ptr @Run_Index, align 8
+ unreachable
+}
+
+define i32 @main() {
+if.end:
+ %call.i4 = call ptr @signal(ptr @report)
+ br label %for.cond
+
+; CHECK-LABEL: for.cond
+; CHECK: store
+for.cond:
+ %0 = load i64, ptr @Run_Index, align 8
+ store i64 %0, ptr @Run_Index, align 8
+ br label %for.cond
+}
diff --git a/llvm/test/Transforms/LICM/signal-before-loop.ll b/llvm/test/Transforms/LICM/signal-before-loop.ll
new file mode 100644
index 000000000000..cfae4e87db56
--- /dev/null
+++ b/llvm/test/Transforms/LICM/signal-before-loop.ll
@@ -0,0 +1,25 @@
+; REQUIRES: enable_build_for_common
+; RUN:opt -disable-move-store-ins-outside-of-loop=true -S < %s | FileCheck %s
+
+@Run_Index = external global i64
+
+declare ptr @__sysv_signal(ptr)
+
+define void @report() {
+entry:
+ %0 = load i64, ptr @Run_Index, align 8
+ unreachable
+}
+
+define i32 @main() {
+if.end:
+ %call.i4 = call ptr @__sysv_signal(ptr @report)
+ br label %for.cond
+
+; CHECK-LABEL: for.cond
+; CHECK: store
+for.cond:
+ %0 = load i64, ptr @Run_Index, align 8
+ store i64 %0, ptr @Run_Index, align 8
+ br label %for.cond
+}
diff --git a/llvm/test/lit.site.cfg.py.in b/llvm/test/lit.site.cfg.py.in
index 0e9396e3b014..20c1ecca1d43 100644
--- a/llvm/test/lit.site.cfg.py.in
+++ b/llvm/test/lit.site.cfg.py.in
@@ -63,6 +63,7 @@ config.dxil_tests = @LLVM_INCLUDE_DXIL_TESTS@
config.have_llvm_driver = @LLVM_TOOL_LLVM_DRIVER_BUILD@
config.use_classic_flang = @LLVM_ENABLE_CLASSIC_FLANG@
config.enable_enable_autotuner = @LLVM_ENABLE_AUTOTUNER@
+config.enable_build_for_common = @LLVM_BUILD_FOR_COMMON@
import lit.llvm
lit.llvm.initialize(lit_config, config)
--
2.38.1.windows.1
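
Reviewer note: the scenario described in the LICM patch above is a handler registered through `signal` that reads a global which the loop stores on every iteration. The sketch below is a minimal standalone illustration of that pattern. It assumes `volatile std::sig_atomic_t` globals, which is the portable way to share data with a handler and already prevents the store from being moved; the patch instead targets plain globals such as `@Run_Index` in the tests. The names `RunIndex`, `Reported`, `report`, and `runUntilSignal` are illustrative.

```cpp
#include <cassert>
#include <csignal>

// Loop index read by the handler, and the value the handler observed.
static volatile std::sig_atomic_t RunIndex = 0;
static volatile std::sig_atomic_t Reported = -1;

// Signal handler: records the loop index visible at the time of the signal.
extern "C" void report(int) { Reported = RunIndex; }

// Runs a loop that stores its index on every iteration and raises SIGINT at
// iteration StopAt. Returns the index the handler observed.
static int runUntilSignal(int StopAt) {
  RunIndex = 0;
  Reported = -1;
  std::signal(SIGINT, report);
  for (int I = 0; I <= StopAt; ++I) {
    RunIndex = I; // store inside the loop; must be visible to the handler
    if (I == StopAt)
      std::raise(SIGINT); // the handler runs before raise returns
  }
  std::signal(SIGINT, SIG_DFL);
  return Reported;
}

// Usage example: the handler must see the index stored in the final iteration.
static void checkHandlerSeesLatestStore() {
  assert(runUntilSignal(5) == 5);
}
```

If the store to the index were sunk below the loop, a handler interrupting the loop would read a stale value, which is the failure mode the patch describes.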

File diff suppressed because it is too large

File diff suppressed because it is too large

@ -0,0 +1,34 @@
From d4cfa4fd4496735ea45afcd2b0cfb3607cadd1c9 Mon Sep 17 00:00:00 2001
From: yinrun <lvyinrun@huawei.com>
Date: Thu, 17 Oct 2024 18:47:40 +0800
Subject: [PATCH] Find Python3 in default env PATH for ACPO
Use the python3 found in the user's PATH, to avoid picking a Python version that lacks the AI infrastructure.
---
llvm/lib/Analysis/ACPOMLInterface.cpp | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/llvm/lib/Analysis/ACPOMLInterface.cpp b/llvm/lib/Analysis/ACPOMLInterface.cpp
index f48eb46638e3..7d84bd5112d6 100644
--- a/llvm/lib/Analysis/ACPOMLInterface.cpp
+++ b/llvm/lib/Analysis/ACPOMLInterface.cpp
@@ -146,7 +146,15 @@ ACPOMLPythonInterface::ACPOMLPythonInterface() : NextID{0} {
}
int32_t PID = (int32_t) llvm::sys::Process::getProcessId();
- std::string ExecPython = "/usr/bin/python3";
+ std::string ExecPython;
+ llvm::ErrorOr<std::string> Res = llvm::sys::findProgramByName("python3");
+ if (std::error_code EC = Res.getError()) {
+ LLVM_DEBUG(dbgs() << "python3 could not be found, error_code " << EC.value() << "\n");
+ return;
+ } else {
+ ExecPython = Res.get();
+ LLVM_DEBUG(dbgs() << "python3 version found in " << ExecPython << "\n");
+ }
std::string
PythonScript = *Env + "/" + std::string(ACPO_ML_PYTHON_INTERFACE_PY);
std::string PIDStr = std::to_string(PID);
--
2.38.1.windows.1
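
Reviewer note: the patch above replaces the hard-coded `/usr/bin/python3` with a lookup through `llvm::sys::findProgramByName`. The idea behind a PATH search can be sketched without LLVM as below. This is a simplified illustration under stated assumptions: `findInPath` is a hypothetical helper, entries are separated by `:` as on POSIX, and it only checks that a regular file exists, without verifying that it is executable.

```cpp
#include <cassert>
#include <filesystem>
#include <optional>
#include <sstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: return the full path of the first entry in PathVar
// (a ':'-separated directory list) that contains a regular file named Name.
static std::optional<std::string> findInPath(const std::string &Name,
                                             const std::string &PathVar) {
  assert(!Name.empty() && "program name must be non-empty");
  std::istringstream Dirs(PathVar);
  std::string Dir;
  while (std::getline(Dirs, Dir, ':')) {
    if (Dir.empty())
      continue; // skip empty PATH entries in this sketch
    const fs::path Candidate = fs::path(Dir) / Name;
    std::error_code EC; // non-throwing overload: missing dirs are not errors
    if (fs::is_regular_file(Candidate, EC))
      return Candidate.string();
  }
  return std::nullopt;
}
```

A caller would pass the value of the `PATH` environment variable and, like the patch, bail out early when no match is found instead of falling back to a fixed location.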

File diff suppressed because it is too large

llvm.spec

@ -1,5 +1,13 @@
%bcond_without sys_llvm
%bcond_without check
%bcond_with classic_flang
%bcond_with toolchain_clang
%bcond_without bisheng_autotuner
%bcond_without ACPO
%if %{with toolchain_clang}
%global toolchain clang
%endif
%global maj_ver 17
%global min_ver 0
@ -37,7 +45,7 @@
Name: %{pkg_name}
Version: %{maj_ver}.%{min_ver}.%{patch_ver}
Release: 1
Release: 28
Summary: The Low Level Virtual Machine
License: NCSA
@ -46,6 +54,43 @@ Source0: https://github.com/llvm/llvm-project/releases/download/llvmorg-%{versio
Source1: https://github.com/llvm/llvm-project/releases/download/llvmorg-%{version}/cmake-%{version}.src.tar.xz
Source2: https://github.com/llvm/llvm-project/releases/download/llvmorg-%{version}/third-party-%{version}.src.tar.xz
# Patch{1-10} for supporting `relax` feature on LoongArch, which is consistent with !47 in openEuler repos
Patch1: 0001-Backport-LoongArch-Add-relax-feature-and-keep-relocations.patch
Patch2: 0002-Backport-LoongArch-Allow-delayed-decision-for-ADD-SUB-relocations.patch
Patch3: 0003-Backport-LoongArch-Emit-R_LARCH_RELAX-when-expanding-some-LoadAddress.patch
Patch4: 0004-Backport-MC-LoongArch-Add-AlignFragment-size-if-layout-is-available-and-not-need-insert-nops.patch
Patch5: 0005-Backport-LoongArch-RISCV-Support-R_LARCH_-ADD-SUB-_ULEB128-R_RISCV_-SET-SUB-_ULEB128-for-uleb128-directives.patch
Patch6: 0006-Backport-LoongArch-Add-relaxDwarfLineAddr-and-relaxDwarfCFA-to-handle-the-mutable-label-diff-in-dwarfinfo.patch
Patch7: 0007-Backport-LoongArch-Insert-nops-and-emit-align-reloc-when-handle-alignment-directive.patch
Patch8: 0008-Backport-test-Update-dwarf-loongarch-relocs.ll.patch
Patch9: 0009-Backport-MC-test-Change-ELF-uleb-ehtable.s-Mach-O-to-use-private-symbols-in-.uleb128-for-label-differences.patch
Patch10: 0010-Backport-Mips-MC-AttemptToFoldSymbolOffsetDifference-revert-isMicroMips-special-case.patch
Patch11: 0011-Backport-LoongArch-Add-the-support-for-vector-in-llvm17.patch
Patch12: 0012-Backport-LoongArch-improve-the-support-for-compiler-rt-and-bugfix.patch
Patch13: 0013-Backport-Bitcode-Add-some-missing-GetTypeByID-failure-checks.patch
Patch14: 0014-Backport-X86-Inline-Skip-inline-asm-in-inlining-targ.patch
Patch15: 0015-Backport-ARM-Check-all-terms-in-emitPopInst-when-clearing-Res.patch
Patch16: 0016-Backport-ARM-Update-IsRestored-for-LR-based-on-all-returns-82.patch
Patch17: 0017-Add-the-support-for-classic-flang.patch
Patch18: 0018-Fix-declaration-definition-mismatch-for-classic-flang.patch
Patch19: 0019-Backport-LoongArch-Improve-the-support-for-atomic-and-clear_cache.patch
Patch20: 0020-Update-llvm-lit-config-to-support-build_for_openeule.patch
Patch21: 0021-Add-BiSheng-Autotuner-support-for-LLVM-compiler.patch
Patch22: 0022-Prevent-environment-variables-from-exceeding-NAME_MA.patch
Patch23: 0023-AArch64-Support-HiSilicon-s-HIP09-Processor.patch
Patch24: 0024-Backport-LoongArch-fix-and-add-some-new-support.patch
Patch25: 0025-Backport-Simple-check-to-ignore-Inline-asm-fwait-insertion.patch
Patch26: 0026-Add-arch-restriction-for-BiSheng-Autotuner.patch
Patch27: 0027-AArch64-Delete-hip09-macro.patch
Patch28: 0028-backport-Clang-Fix-crash-with-fzero-call-used-regs.patch
Patch29: 0029-SimplifyLibCalls-Merge-sqrt-into-the-power-of-exp-79.patch
Patch30: 0030-LICM-Solve-runtime-error-caused-by-the-signal-functi.patch
Patch31: 0031-ACPO-ACPO-Infrastructure.patch
Patch32: 0032-ACPO-Introduce-MLInliner-using-ACPO-infrastructure.patch
Patch33: 0033-Find-Python3-in-default-env-PATH-for-ACPO.patch
Patch34: 0034-AArch64-Support-HiSilicon-s-HIP09-sched-model.patch
BuildRequires: binutils-devel
BuildRequires: cmake
BuildRequires: gcc
@@ -61,6 +106,9 @@ BuildRequires: python3-recommonmark
BuildRequires: python3-sphinx
BuildRequires: python3-setuptools
BuildRequires: zlib-devel
%if %{with toolchain_clang}
BuildRequires: clang
%endif
Requires: %{name}-libs%{?_isa} = %{version}-%{release}
@@ -97,6 +145,8 @@ programs that use the LLVM infrastructure.
Summary: Documentation for LLVM
BuildArch: noarch
Requires: %{name} = %{version}-%{release}
Provides: %{name}-help = %{version}-%{release}
Obsoletes: %{name}-help < %{version}-%{release}
%description doc
Documentation for the LLVM compiler infrastructure.
@@ -158,6 +208,12 @@ pathfix.py -i %{__python3} -pn \
mkdir -p _build
cd _build
%if %{with ACPO}
echo "enable ACPO"
export CFLAGS="-Wp,-DENABLE_ACPO ${CFLAGS}"
export CXXFLAGS="-Wp,-DENABLE_ACPO ${CXXFLAGS}"
%endif
%cmake .. -G Ninja \
-DBUILD_SHARED_LIBS:BOOL=OFF \
-DLLVM_PARALLEL_LINK_JOBS=%{max_link_jobs} \
@@ -203,9 +259,18 @@ cd _build
-DLLVM_LIBDIR_SUFFIX=64 \
%else
-DLLVM_LIBDIR_SUFFIX= \
%endif
%if %{with classic_flang}
-DLLVM_ENABLE_CLASSIC_FLANG=ON \
%endif
%if "%{toolchain}" == "clang"
-DCMAKE_C_COMPILER=clang \
-DCMAKE_CXX_COMPILER=clang++ \
%endif
%if %{with bisheng_autotuner}
-DLLVM_ENABLE_AUTOTUNER=ON \
%endif
-DLLVM_INCLUDE_BENCHMARKS=OFF
%ninja_build LLVM
%ninja_build
@@ -266,7 +331,6 @@ LD_LIBRARY_PATH=%{buildroot}/%{install_libdir} %{__ninja} check-all -C ./_build
%files
%license LICENSE.TXT
%{install_prefix}/share/man/man1/*
%{install_bindir}/*
%exclude %{install_bindir}/not
%exclude %{install_bindir}/count
@@ -296,6 +360,7 @@ LD_LIBRARY_PATH=%{buildroot}/%{install_libdir} %{__ninja} check-all -C ./_build
%files doc
%license LICENSE.TXT
%doc %{install_docdir}/html
%{install_prefix}/share/man/man1/*
%files static
%license LICENSE.TXT
@@ -327,10 +392,91 @@ LD_LIBRARY_PATH=%{buildroot}/%{install_libdir} %{__ninja} check-all -C ./_build
%{install_includedir}/llvm-gmock
%changelog
* Fri Nov 22 2024 xiajingze <xiajingze1@huawei.com> - 17.0.6-28
- [AArch64] Support HiSilicon's HIP09 sched model
* Wed Nov 20 2024 eastb233 <xiezhiheng@huawei.com> - 17.0.6-27
- Find Python3 in default env PATH for ACPO
* Wed Nov 20 2024 eastb233 <xiezhiheng@huawei.com> - 17.0.6-26
- ACPO Infrastructure for ML integration into LLVM compiler
* Wed Nov 20 2024 eastb233 <xiezhiheng@huawei.com> - 17.0.6-25
- [LICM] Solve runtime error caused by the signal function.
* Wed Nov 20 2024 eastb233 <xiezhiheng@huawei.com> - 17.0.6-24
- [SimplifyLibCalls] Merge sqrt into the power of exp (#79146)
* Tue Nov 19 2024 xiajingze <xiajingze1@huawei.com> - 17.0.6-23
- [backport][Clang] Fix crash with -fzero-call-used-regs
* Mon Nov 18 2024 xiajingze <xiajingze1@huawei.com> - 17.0.6-22
- [AArch64] Delete hip09 macro
* Mon Nov 18 2024 liyunfei <liyunfei33@huawei.net> - 17.0.6-21
- Add arch restriction for BiSheng Autotuner
* Mon Nov 18 2024 liyunfei <liyunfei33@huawei.net> - 17.0.6-20
- [Backport] Simple check to ignore Inline asm fwait insertion
* Mon Sep 23 2024 zhanglimin <zhanglimin@loongson.cn> - 17.0.6-19
- [LoongArch] Backport some new support
* Thu Sep 12 2024 xiajingze <xiajingze1@huawei.com> - 17.0.6-18
- [AArch64] Support HiSilicon's HIP09 Processor
* Wed Sep 11 2024 hongjinghao <hongjinghao@huawei.com> - 17.0.6-17
- doc add Provides llvm-help
* Tue Sep 10 2024 hongjinghao <hongjinghao@huawei.com> - 17.0.6-16
- doc add Obsoletes llvm-help
* Tue Sep 3 2024 hongjinghao <hongjinghao@huawei.com> - 17.0.6-15
- mv man to doc subpackage
* Mon Jul 22 2024 liyunfei <liyunfei33@huawei.com> - 17.0.6-14
- Prevent environment variables from exceeding NAME_MAX.
* Mon Jul 22 2024 liyunfei <liyunfei33@huawei.com> - 17.0.6-13
- Temporarily disable toolchain_clang build for BiSheng Autotuner support.
* Tue Jul 16 2024 liyunfei <liyunfei33@huawei.com> - 17.0.6-12
- Add BiSheng Autotuner support.
* Fri Jul 5 2024 liyunfei <liyunfei33@huawei.com> - 17.0.6-11
- Add toolchain_clang build support
* Mon Apr 29 2024 wangqiang <wangqiang1@kylinos.cn> - 17.0.6-10
- Update llvm-lit config to support macro `build_for_openeuler`
* Sun Apr 21 2024 zhanglimin <zhanglimin@loongson.cn> - 17.0.6-9
- Improve the support for atomic and __clear_cache
* Wed Apr 17 2024 luofeng <luofeng13@huawei.com> - 17.0.6-8
- Add the support for classic flang
* Fri Apr 12 2024 liyunfei <liyunfei33@huawei.com> - 17.0.6-7
- Backport patch to fix CVE-2024-31852
* Thu Apr 11 2024 wangqiang <wangqiang1@kylinos.cn> - 17.0.6-6
- Skip inline asm in inlining target feature check on X86
* Tue Apr 09 2024 liyunfei <liyunfei33@huawei.com> - 17.0.6-5
- Backport patch to fix CVE-2023-46049
* Wed Apr 03 2024 zhanglimin <zhanglimin@loongson.cn> - 17.0.6-4
- Improve the support for compiler-rt and fix some bugs on LoongArch
* Fri Mar 29 2024 zhanglimin <zhanglimin@loongson.cn> - 17.0.6-3
- Add the support for vector on LoongArch
* Sat Mar 16 2024 zhanglimin <zhanglimin@loongson.cn> - 17.0.6-2
- Support `relax` feature on LoongArch
* Thu Nov 30 2023 zhoujing <zhoujing106@huawei.com> - 17.0.6-1
- Update to 17.0.6
* Tue Jul 13 2023 cf-zhao <zhaochuanfeng@huawei.com> -12.0.1-7
* Thu Jul 13 2023 cf-zhao <zhaochuanfeng@huawei.com> -12.0.1-7
- Disable check.
* Sat Jul 08 2023 cf-zhao <zhaochuanfeng@huawei.com> -12.0.1-6